

Poster

Multi-LLM-Agents Debate - Performance, Efficiency, and Scaling Challenges

Hangfan Zhang · Zhiyao Cui · Qiaosheng Zhang · Shuyue Hu

Hall 3 + Hall 2B #552
Fri 25 Apr, midnight to 2:30 a.m. PDT

Abstract:

Multi-Agent Debate (MAD) explores leveraging collaboration among multiple large language model (LLM) agents to improve test-time performance without additional training. This blog evaluates five MAD frameworks across nine benchmarks, revealing that current MAD methods fail to consistently outperform simpler single-agent strategies, even with increased computational resources. Analysis of factors such as agent configurations and debate rounds suggests that existing MAD designs fall short in fully utilizing additional inference-time computation.
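The debate-then-aggregate mechanism the abstract describes can be sketched as a simple loop. This is a minimal illustration with hypothetical stub agents and a majority-vote aggregator, not an implementation of any of the five evaluated frameworks; the function and agent names are invented for this sketch.

```python
from collections import Counter

def debate(agents, question, rounds=2):
    """Hypothetical MAD sketch: each agent is a callable
    (question, peer_answers) -> answer. Agents first answer
    independently, then revise over `rounds` debate rounds after
    seeing peers' answers; the final answer is a majority vote."""
    # Round 0: each agent answers without seeing any peers.
    answers = [agent(question, []) for agent in agents]
    for _ in range(rounds):
        # Each agent revises its answer given all previous answers.
        answers = [agent(question, answers) for agent in agents]
    # Aggregate the final round by majority vote.
    return Counter(answers).most_common(1)[0][0]

# Toy stand-ins for LLM agents: one stubborn agent and two that
# adopt whatever the current majority answer is.
def stubborn(question, peers):
    return "4"

def conformist(question, peers):
    if peers:
        return Counter(peers).most_common(1)[0][0]
    return "5"

print(debate([stubborn, conformist, conformist], "What is 2 + 2?"))
```

Note that with two conformists the initial majority simply persists across rounds, which mirrors the abstract's point: extra debate rounds do not automatically translate into better answers.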
