Code-Driven Game-Theoretic Evolution of LLM Agents as Holistic Strategy Generators
Abstract
Game-theoretic settings provide a potent testbed for the strategic reasoning of Large Language Models (LLMs). However, current evaluations are largely confined to static single-shot games or to myopic, per-timestep decisions, leaving the capacity for holistic, long-term policy formulation unexamined. We introduce an LLM-driven evolutionary game tournament framework to investigate emergent, far-sighted strategy. In our paradigm, the LLM acts as a high-level strategy generator that produces complete, interpretable Python policies. These policies are iteratively refined against a dynamic Hall of Fame (HoF): the LLM analyzes tournament performance and generates improved variants, thereby directing the ecosystem's evolution. Extensive experiments across five distinct LLM architectures and multiple random seeds show that this process enables LLMs to consistently discover robust cooperative policies that outperform standard reciprocity-based algorithms, and to autonomously generate complex deceptive strategies that exhibit distinct phases of strategic disguise, inducement, and payoff harvesting. These findings indicate that such strategic evolution is a robust, model-agnostic phenomenon.
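To make the loop described above concrete, the following is a minimal sketch of a generate, evaluate, and refine cycle against a Hall of Fame. It is an illustration under stated assumptions, not the framework used in this work: the abstract does not name the game, so an iterated Prisoner's Dilemma with conventional payoffs is assumed; `propose_policy` is a hypothetical stand-in for the LLM call and simply returns a fixed hand-written policy; and the names `play_match`, `evaluate`, and `evolve`, as well as the HoF update rule (keep the top scorers), are illustrative choices.

```python
from typing import Callable, List, Tuple

# A policy maps (own history, opponent history) to "C" (cooperate) or "D" (defect).
Policy = Callable[[List[str], List[str]], str]

# ASSUMPTION: iterated Prisoner's Dilemma payoffs; the abstract does not name the game.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def tit_for_tat(mine: List[str], theirs: List[str]) -> str:
    """Reciprocity baseline: cooperate first, then copy the opponent's last move."""
    return theirs[-1] if theirs else "C"


def always_defect(mine: List[str], theirs: List[str]) -> str:
    return "D"


def play_match(a: Policy, b: Policy, rounds: int = 50) -> Tuple[int, int]:
    """Play a full repeated game and return the total payoff of each policy."""
    hist_a: List[str] = []
    hist_b: List[str] = []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a, move_b = a(hist_a, hist_b), b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


def evaluate(candidate: Policy, pool: List[Policy], rounds: int = 50) -> float:
    """Mean payoff of `candidate` against every member of `pool`."""
    return sum(play_match(candidate, opp, rounds)[0] for opp in pool) / len(pool)


def propose_policy(feedback: str) -> Policy:
    """HYPOTHETICAL stand-in for the LLM call.

    A real system would send `feedback` to a model and load the Python policy
    it returns; here a fixed grim-trigger policy is returned instead.
    """
    def grim_trigger(mine: List[str], theirs: List[str]) -> str:
        return "D" if "D" in theirs else "C"
    return grim_trigger


def evolve(generations: int = 3, hof_size: int = 4) -> List[Policy]:
    """Grow a Hall of Fame by repeatedly adding a proposed policy and pruning."""
    hall_of_fame: List[Policy] = [tit_for_tat, always_defect]
    for _ in range(generations):
        # Summarize tournament performance as text feedback for the generator.
        feedback = "; ".join(
            f"{p.__name__}: {evaluate(p, hall_of_fame):.1f}" for p in hall_of_fame
        )
        pool = hall_of_fame + [propose_policy(feedback)]
        # Score everyone against the enlarged pool, then keep the top performers.
        scores = {id(p): evaluate(p, pool) for p in pool}
        hall_of_fame = sorted(pool, key=lambda p: scores[id(p)], reverse=True)[:hof_size]
    return hall_of_fame
```

Calling `evolve()` returns the surviving policies ranked by mean payoff. Replacing `propose_policy` with a real model call, and the pruning rule with whatever HoF update the full method specifies, are the two points where this sketch would need to change.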