Extracting Model Precision from 20 Logprobs
Yiming Zhang · Javier Rando · Florian Tramer · Daphne Ippolito · Nicholas Carlini
Abstract
We demonstrate that the internal floating-point precision of language models can be inferred from API-exposed logprobs. Our key insight is that log-softmax shifts all logits by a shared constant (the log-sum-exp of the logits), so we can search for shift values that map the observed logprobs back to values representable in a given precision. Using just 20 logprobs from a single API call, we can reliably distinguish FP32, BF16, FP16, and FP8 formats. Applying our method to production APIs, we find that older OpenAI models (GPT-3.5, GPT-4) use FP32 logits while newer models (GPT-4o, GPT-4.1) use BF16, and Gemini 2.0 uses FP32.
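The core idea can be sketched for a single format. This is a hypothetical toy illustration (FP16 only, with assumed names `search_fp16_shifts` and `fp16_roundtrip_error`), not the paper's implementation: anchor on one logprob, enumerate every FP16 value it could have been before the log-softmax shift, and for each candidate shift count how many shifted logprobs land exactly on FP16-representable values.

```python
import numpy as np

def fp16_roundtrip_error(x):
    """Distance from each value to the nearest FP16-representable number."""
    x = np.asarray(x, dtype=np.float64)
    return np.abs(np.float16(x).astype(np.float64) - x)

def search_fp16_shifts(logprobs, max_logit=32.0, tol=1e-9):
    """Toy sketch of the shift search for one format (FP16).
    For each candidate shift c = v - logprobs[0], where v ranges over
    all finite FP16 values a logit could plausibly take, count how many
    shifted logprobs are (near-)exactly FP16-representable.  Returns the
    best hit count and all shifts achieving it."""
    logprobs = np.asarray(logprobs, dtype=np.float64)
    # All 2**16 bit patterns enumerate every FP16 value; keep the finite,
    # plausibly logit-sized ones.
    grid = np.arange(2**16, dtype=np.uint16).view(np.float16).astype(np.float64)
    grid = grid[np.isfinite(grid) & (np.abs(grid) <= max_logit)]
    best_hits, best_shifts = -1, []
    for c in grid - logprobs[0]:
        hits = int(np.sum(fp16_roundtrip_error(logprobs + c) < tol))
        if hits > best_hits:
            best_hits, best_shifts = hits, [c]
        elif hits == best_hits:
            best_shifts.append(c)
    return best_hits, best_shifts

# Simulate an API: FP16 logits, log-softmax taken in higher precision.
rng = np.random.default_rng(0)
logits = np.float16(rng.normal(0.0, 3.0, size=20)).astype(np.float64)
shift = np.log(np.exp(logits).sum())      # the shared log-sum-exp constant
logprobs = logits - shift                 # what an API would expose

hits, shifts = search_fp16_shifts(logprobs)
```

Under these assumptions, a shift exists that makes all 20 logprobs FP16-representable, and the true log-sum-exp constant is among the recovered candidates; if the logits had instead been FP32 or BF16, no single shift would align all 20 logprobs to the FP16 grid, which is what lets the formats be told apart.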