AlphaAgentEvo: Evolution-Oriented Alpha Mining via Self-Evolving Agentic Reinforcement Learning
Abstract
Alpha mining seeks to identify predictive alpha factors that generate excess returns beyond the market from a vast and noisy search space; however, existing approaches struggle to support the systematic evolution of alphas. Traditional methods such as genetic programming cannot interpret natural-language instructions and often fail to extract insights from unsuccessful attempts, leading to low interpretability and inefficient exploration. Similarly, lacking mechanisms for systematic evolution such as long-term planning and reflection, multi-agent approaches easily fall into repetitive evolutionary routines and thus fail to achieve efficient self-evolution. To overcome these limitations, we introduce AlphaAgentEvo, a self-evolving Agentic Reinforcement Learning (ARL) framework that moves alpha mining beyond the brittle “search–backtest–restart” cycle toward a continuous trajectory of evolution. Guided by a hierarchical reward function, our agent explores the search space on its own, progressively learning to satisfy basic requirements (e.g., valid tool calls) and then harder objectives (e.g., continuous performance improvement). Through this process, the agent acquires advanced behaviors such as long-horizon planning and reflective reasoning, which allow it to react actively to the underlying market state (e.g., regime shifts) and realize genuine self-evolution, taking a step toward more principled and scalable alpha mining. Extensive experiments demonstrate that AlphaAgentEvo achieves more efficient alpha evolution and generates more diverse and transferable alphas, consistently surpassing a wide range of baselines. Notably, with only 4B parameters, it outperforms LLM-driven evolution methods built on state-of-the-art closed-source reasoning models, highlighting the promise of ARL for next-generation alpha mining.
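To make the hierarchical reward described above concrete, the following is a minimal sketch of how such a staged reward could be structured, moving from basic validity checks to performance-improvement signals. All names, weights, and thresholds here (e.g., `StepOutcome`, the IC-based metric, the stage bonuses) are illustrative assumptions, not the paper's actual reward design.

```python
# Illustrative sketch of a hierarchical reward for alpha-mining ARL.
# Every name, weight, and threshold below is hypothetical; the paper's
# concrete reward function is not specified in this abstract.

from dataclasses import dataclass


@dataclass
class StepOutcome:
    tool_call_valid: bool   # did the agent emit a well-formed tool call?
    alpha_compiles: bool    # did the proposed alpha expression evaluate?
    ic: float               # information coefficient of the new alpha (assumed metric)
    best_ic_so_far: float   # best IC achieved earlier in the trajectory


def hierarchical_reward(o: StepOutcome) -> float:
    """Stage rewards from basic requirements up to continuous improvement."""
    # Stage 1: basic requirement -- the tool call itself must be valid.
    if not o.tool_call_valid:
        return -1.0
    reward = 0.1  # small positive signal for a well-formed action

    # Stage 2: the proposed alpha must actually execute on market data.
    if not o.alpha_compiles:
        return reward - 0.5
    reward += 0.2

    # Stage 3: harder objective -- reward improvement over the best alpha
    # found so far in the trajectory, not raw performance alone.
    improvement = o.ic - o.best_ic_so_far
    if improvement > 0:
        reward += improvement  # scaled improvement bonus (weight assumed 1.0)
    return reward
```

Under such a staged signal, the agent first learns to act validly and produce executable alphas, and only later is driven by the improvement term, which is one plausible way the progressive learning and long-horizon behaviors described in the abstract could emerge.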