Certified Coherent Reasoning for LLMs via Weighted MaxSAT and Belief-Revision Geometry
Abstract
Large language models (LLMs) frequently produce mutually incompatible answers across sets of related questions, limiting reliability in domains that require globally consistent deduction, abduction, and theory maintenance. We formalize coherent multi-query answering as an optimization problem: an LLM supplies candidate answers with preference scores, logic supplies global feasibility constraints, and we select an answer set maximizing preference subject to satisfiability. Our main technical contribution is a proof-carrying coherence decoder that compiles the selection problem to weighted partial MaxSAT, outputs a globally consistent answer set, and emits certificates enabling independent verification of feasibility (and optionally optimality). We introduce solver-grounded coherence metrics, including the coherence gap (log-preference sacrificed to achieve global satisfiability) and minimal-change repair distance (belief-revision style). Theoretically, we give equivalences to distance-based belief revision (including a Dalal-style specialization), a complexity landscape (NP-hardness under severe restrictions), and a probabilistic extension as a KL projection onto the coherent support set. We also propose benchmark templates (pure symbolic and controlled natural language) in which the semantic map to logic is unambiguous, enabling clean measurement of global coherence, contradiction localization, and certified repair.