Beyond Self-Refinement: Ensembling and Chaining for Neurosymbolic Reasoning
Abstract
Neurosymbolic reasoning systems combine neural language models with symbolic solvers to produce faithful logical inference. We investigate whether iterative refinement or diversity-based ensembling more effectively improves Logic-LLM on FOLIO. Using GPT-4 and Prover9, we find that solver-guided self-refinement does not improve accuracy in our runs, saturating at 77.94%. In contrast, selection-based methods provide consistent gains: a hybrid selector over original and refined programs and an uncertainty-aware ensemble both reach 79.90%. We further propose a chain-to-logic pipeline that converts multiple reasoning chains into logic programs and aggregates them via pass@k, achieving 84.31% accuracy at pass@3. Our results show that diversity and selective ensembling are more effective than iterative repair for improving neurosymbolic reasoning.