MAD-Logic: Multi-Agent Debate Enhances Symbolic Translation and Reasoning
Abstract
Large language models (LLMs) struggle with complex logical reasoning. Prior methods broadly follow two pipelines: (1) translating natural language (NL) into a symbolic language (SL) and reasoning with external solvers, or (2) prompting or fine-tuning LLMs to reason directly in NL. However, we point out that, on the one hand, translation into any single SL often fails to capture important features of the raw NL, leading to information loss or translation errors. On the other hand, both pipelines have non-negligible limitations: the former (SL-based) methods are highly sensitive to imperfect translations, while the latter (NL-based) methods are prone to hallucination. Motivated by this, we propose, to our knowledge, the first multi-agent debate framework that leverages the strengths of different SLs and reasoning methods, improving performance in both the translation and reasoning stages. Specifically, in the translation stage, multiple agents translate the NL into different SLs and refine their translations through debate. In the reasoning stage, SL-based agents (whose answers are obtained from the corresponding solvers) and NL-based agents debate over multiple rounds, with the final answer determined by majority vote. In addition, to address the inefficiency of multi-agent debate, we introduce an adaptive sparse communication strategy that prunes unnecessary interactions based on agent confidence and information gain. Extensive experiments on three datasets show that our method improves logical QA performance while reducing computational cost.
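The reasoning-stage protocol described above (multi-round debate among heterogeneous agents, confidence-based pruning of interactions, and a majority vote) can be sketched in Python. This is a minimal illustration, not the paper's implementation: the agent interface, the confidence threshold, and the stub agents are all hypothetical, and real agents would wrap LLM or solver calls.

```python
from collections import Counter

def majority_vote(answers):
    """Return the most common answer among the agents' final answers."""
    return Counter(answers).most_common(1)[0][0]

def debate(agents, rounds=2, conf_threshold=0.9):
    """Run a multi-round debate and resolve it by majority vote.

    agents: list of callables; agent(peer_answers) -> (answer, confidence).
    An empty peer_answers list means the agent answers independently.
    """
    # Round 0: each agent proposes an answer independently.
    answers, confs = [], []
    for agent in agents:
        a, c = agent([])
        answers.append(a)
        confs.append(c)
    # Debate rounds with sparse communication (sketch): agents that are
    # already confident skip the round, pruning their interactions.
    for _ in range(rounds):
        for i, agent in enumerate(agents):
            if confs[i] >= conf_threshold:
                continue
            peers = [a for j, a in enumerate(answers) if j != i]
            answers[i], confs[i] = agent(peers)
    return majority_vote(answers)

# Toy agents (hypothetical): two confident "True" voters and one
# uncertain agent that adopts the peer majority once it sees peers.
confident = lambda peers: ("True", 0.95)
def uncertain(peers):
    if peers:
        return Counter(peers).most_common(1)[0][0], 0.5
    return "False", 0.4

print(debate([confident, confident, uncertain]))  # -> True
```

In this toy run, the two confident agents never re-enter the debate, so only the uncertain agent incurs communication cost before the vote is taken, which is the intuition behind pruning by confidence.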