Stress-Adaptive Belief Control for Agentic Decision Systems
Abstract
Financial decision systems operate under persistent non-stationarity, where prolonged stress, regime shifts, and structural breaks undermine the assumptions underlying both classical portfolio methods and modern reinforcement learning agents. Existing approaches typically regulate policies or rewards while leaving inference dynamics fixed, leading to overconfidence in stable periods and belief collapse during crises. We propose a stress-adaptive belief control mechanism that explicitly constrains inference rather than policy updates. The method introduces a KL-anchored trust region that prevents abrupt belief shifts, coupled with an adaptive entropy term that modulates inference bandwidth in response to market stress. This yields a stable, interpretable belief process that widens under crisis conditions and contracts as structure re-emerges. The same belief dynamics serve a dual role, conditioning downstream agentic policies and enabling the generation of stress-sensitive scenarios with realistic persistence and recovery behavior. We evaluate the framework across synthetic regime-switching environments, multi-armed bandits, and long-horizon financial data, demonstrating improved robustness, calibration, and post-crisis recovery relative to entropy-regularized reinforcement learning and regime-agnostic baselines. By treating inference bandwidth as a controllable quantity, this work provides a unified foundation for stress-aware learning and responsible agentic decision-making under structural uncertainty.
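As an illustration only, the following Python sketch shows one way the two components named above could be combined for a discrete belief over regimes. It is not taken from the paper's formulation: the tempering form of the entropy term, the linear interpolation used to enforce the KL trust region, and the names stress_adaptive_update, delta, and kappa are all assumptions introduced for this example.

```python
import math


def normalize(weights):
    """Rescale non-negative weights so they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def kl(q, p):
    """KL(q || p) for discrete distributions; terms with q_i = 0 contribute 0."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0.0)


def entropy(p):
    """Shannon entropy in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)


def stress_adaptive_update(prior, likelihood, stress, delta=0.05, kappa=2.0):
    """One hypothetical belief step (all parameter names are assumptions).

    prior      : current belief over regimes, strictly positive, sums to 1
    likelihood : per-regime likelihood of the new observation
    stress     : scalar in [0, 1]; larger values widen the belief
    delta      : KL trust-region radius around the prior (the anchor)
    kappa      : how strongly stress raises the tempering temperature
    """
    # Unconstrained Bayesian target.
    posterior = normalize([b * l for b, l in zip(prior, likelihood)])

    # Assumed entropy control: a temperature >= 1 flattens the target,
    # so higher stress gives a higher-entropy (wider) belief.
    temperature = 1.0 + kappa * stress
    target = normalize([p ** (1.0 / temperature) for p in posterior])

    # Trust region: accept the target if it is close enough to the prior.
    if kl(target, prior) <= delta:
        return target

    # Otherwise bisect on the step size along the segment from prior to
    # target. KL(mix || prior) is convex in the step and zero at step 0,
    # so it is non-decreasing on [0, 1] and bisection is valid.
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        mixed = [(1.0 - mid) * b + mid * t for b, t in zip(prior, target)]
        if kl(mixed, prior) <= delta:
            lo = mid
        else:
            hi = mid
    return [(1.0 - lo) * b + lo * t for b, t in zip(prior, target)]


if __name__ == "__main__":
    prior = [0.5, 0.3, 0.2]
    likelihood = [0.1, 0.2, 0.9]  # observation favours the third regime
    for stress in (0.0, 1.0):
        belief = stress_adaptive_update(prior, likelihood, stress, delta=10.0)
        print(stress, [round(b, 3) for b in belief], round(entropy(belief), 3))
```

In this sketch the trust region bounds how far a single observation can move the belief, while the stress input only changes how sharp the target is. Setting delta large, as in the usage lines, switches the trust region off so that the widening effect of stress can be inspected in isolation.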