Extracting Model Precision from 20 Logprobs
Yiming Zhang · Javier Rando · Florian Tramer · Daphne Ippolito · Nicholas Carlini
Abstract
We demonstrate that the internal floating-point precision of language models can be inferred from API-exposed logprobs. Our key insight is that log-softmax shifts all logits by a shared constant (the log-sum-exp of the logits), so we can search for shift values that map the observed logprobs back to values representable in a given precision. Using just 20 logprobs from a single API call, we can reliably distinguish FP32, BF16, FP16, and FP8 formats. Applying our method to production APIs, we find that older OpenAI models (GPT-3.5, GPT-4) use FP32 logits while newer models (GPT-4o, GPT-4.1) use BF16, and Gemini 2.0 uses FP32.
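The core idea can be sketched for a single format. This is a hypothetical toy illustration (FP16 only, with assumed names `search_fp16_shifts` and `fp16_roundtrip_error`), not the paper's implementation: anchor on one logprob, enumerate every FP16 value it could have been before the log-softmax shift, and for each candidate shift count how many shifted logprobs land exactly on FP16-representable values.

```python
import numpy as np

def fp16_roundtrip_error(x):
    """Distance from each value to the nearest FP16-representable number."""
    x = np.asarray(x, dtype=np.float64)
    return np.abs(np.float16(x).astype(np.float64) - x)

def search_fp16_shifts(logprobs, max_logit=32.0, tol=1e-9):
    """Toy sketch of the shift search for one format (FP16).
    For each candidate shift c = v - logprobs[0], where v ranges over
    all finite FP16 values a logit could plausibly take, count how many
    shifted logprobs are (near-)exactly FP16-representable.  Returns the
    best hit count and all shifts achieving it."""
    logprobs = np.asarray(logprobs, dtype=np.float64)
    # All 2**16 bit patterns enumerate every FP16 value; keep the finite,
    # plausibly logit-sized ones.
    grid = np.arange(2**16, dtype=np.uint16).view(np.float16).astype(np.float64)
    grid = grid[np.isfinite(grid) & (np.abs(grid) <= max_logit)]
    best_hits, best_shifts = -1, []
    for c in grid - logprobs[0]:
        hits = int(np.sum(fp16_roundtrip_error(logprobs + c) < tol))
        if hits > best_hits:
            best_hits, best_shifts = hits, [c]
        elif hits == best_hits:
            best_shifts.append(c)
    return best_hits, best_shifts

# Simulate an API: FP16 logits, log-softmax taken in higher precision.
rng = np.random.default_rng(0)
logits = np.float16(rng.normal(0.0, 3.0, size=20)).astype(np.float64)
shift = np.log(np.exp(logits).sum())      # the shared log-sum-exp constant
logprobs = logits - shift                 # what an API would expose

hits, shifts = search_fp16_shifts(logprobs)
```

Under these assumptions, a shift exists that makes all 20 logprobs FP16-representable, and the true log-sum-exp constant is among the recovered candidates; if the logits had instead been FP32 or BF16, no single shift would align all 20 logprobs to the FP16 grid, which is what lets the formats be told apart.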