The Epistemic Cost of Preference Optimization
Rian Atri
Abstract
Stateful LLMs often exhibit contradictions across related questions and struggle with logical deduction and abductive revision under explicit evidence. We study how preference pressure and evidence access influence epistemic consistency, with the goal of avoiding contradictions across multiple related questions. To quantify these effects, we propose a minimal evaluation protocol built on a “pressure ladder” and an “evidence toggle”. We demonstrate the feasibility of this protocol through an empirical pilot and a reporting-gap audit. Finally, we propose a reporting checklist for logical reasoning papers that tracks epistemic consistency, evidence sensitivity, grounding/citation integrity, and a calibration proxy alongside standard metrics.