Joint Consistency: A Unified Test-Time Aggregation Framework with Pairwise Comparisons
Hongye Wang ⋅ Yunzhen Yao ⋅ Yahong Wang ⋅ Michael Gastpar ⋅ Bo Jiang ⋅ Lie He
Abstract
We propose Joint Consistency (JC), a unified framework for test-time aggregation that jointly models independent trace-level evaluations and pairwise comparisons. We cast JC as a constrained Ising-type energy minimization problem, which subsumes a broad class of existing aggregation schemes. We instantiate JC with LLM-as-a-judge comparative signals, characterize its theoretical behavior, and develop efficient approximations for practical deployment. Experiments on challenging math reasoning benchmarks show that JC outperforms state-of-the-art baselines, especially in crowdsourced settings, across diverse architectures and trace budgets with marginal computational overhead.