Miscalibrated Belief Updates in LLM Agents under Strategic Uncertainty
Harry Ilanyan ⋅ Krish Jain
Abstract
We examine whether an LLM scales its belief updates with evidence strength in a strategic environment, and find that Llama-3.1-70B exhibits systematic failures. Using a heads-up poker environment with reference Bayesian oracles, we compare elicited LLM beliefs against a card-only posterior (a combinatorial prior) and a strategy-aware posterior (a Bayesian update incorporating opponent actions). Across 1,084 elicited beliefs, Llama-3.1-70B beliefs remain closer to the card-only baseline than to the strategy-aware posterior ($\Delta = 0.014$ in Jensen–Shannon distance, 95\% CI $[0.011, 0.017]$). We observe severe base-rate neglect: the model assigns a 17\% probability to "trash" hands versus the oracle's 66\% (an $\approx$4$\times$ underweighting). The model does attempt to update its beliefs, but the updates are weakly coupled to the Bayesian signal ($r \approx 0.06$) and inflated in magnitude (3–6$\times$). These findings suggest belief inertia in Llama-3.1-70B, with near-fixed-magnitude updates that are largely independent of evidence strength. This highlights potential risks of deploying language-model agents in mechanism-design settings that require calibrated belief formation.
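As a rough illustration of the comparison described in the abstract, the sketch below computes the Jensen–Shannon distance from an elicited belief to each of two reference posteriors and takes the difference. Everything here is an assumption made for illustration: the function name `js_distance`, the three-bucket hand categories, the probability values, the base-2 logarithm, and the sign convention for the gap are hypothetical and are not taken from the paper's data or code.

```python
import math


def js_distance(p, q):
    """Jensen-Shannon distance: square root of the JS divergence (base-2 logs).

    With base-2 logs the result lies in [0, 1].
    """
    # Mixture distribution used by the JS divergence.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # KL divergence, skipping zero-probability terms of x.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    divergence = (kl(p, m) + kl(q, m)) / 2
    # Guard against tiny negative values from floating-point rounding.
    return math.sqrt(max(0.0, divergence))


# Hypothetical distributions over three illustrative hand buckets.
# These numbers are placeholders, not results from the paper.
card_only = [0.60, 0.30, 0.10]       # assumed combinatorial prior
strategy_aware = [0.40, 0.35, 0.25]  # assumed posterior after opponent actions
elicited = [0.55, 0.30, 0.15]        # assumed belief reported by the model

d_card = js_distance(elicited, card_only)
d_strategy = js_distance(elicited, strategy_aware)

# Assumed sign convention: positive means the elicited belief sits
# closer to the card-only baseline than to the strategy-aware posterior.
gap = d_strategy - d_card
print(f"distance to card-only:      {d_card:.4f}")
print(f"distance to strategy-aware: {d_strategy:.4f}")
print(f"gap:                        {gap:.4f}")
```

The underweighting factor quoted in the abstract follows from the two stated probabilities: 0.66 / 0.17 is about 3.9, which rounds to the reported 4$\times$.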