When Fuzzing Becomes Agentic: Semantic State Exploration in the Wild
Andrew Yin ⋅ Zhaoling Chen ⋅ Qian Zhang ⋅ Heng Yin
Abstract
Agentic AI systems increasingly navigate stateful, rule-bounded software programs, where meaningful behaviors surface only after long sequences of valid, context-aware interactions. Current evaluation methods are often outcome-focused and provide limited visibility into whether an agent has reached novel internal states. Traditional automated testing, meanwhile, frequently assumes a stateless environment and generates invalid interactions that violate consistency constraints. We propose ASA-Fuzz, a closed-loop agentic testing framework for stateful programs that treats software fuzzing as an agentic decision-making problem. The framework (1) infers and instruments state-relevant runtime signals to provide a semantic notion of progress, (2) synthesizes context-aware operators that generate valid multi-step interactions conditioned on observed state, and (3) adapts when exploration plateaus. In a case study on a rule-based, stateful chess environment, ASA-Fuzz discovered over two orders of magnitude more unique board states and achieved a 3.5$\times$ to 12$\times$ greater mean effective move depth than industry-standard (AFL++), state-aware (SGFuzz), and LLM-driven (Fuzz4All) baselines.
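To make the three-part loop concrete, the following is a minimal sketch under stated assumptions, not the ASA-Fuzz implementation. It replaces the chess environment with a hypothetical toy: a single knight on a 5x5 board, where the state is the knight's square. The names `valid_moves`, `explore`, and `plateau_limit` are illustrative and do not come from the paper, and the restart-on-plateau rule is one possible adaptation strategy chosen for brevity.

```python
import random

# Toy rule-bounded environment (hypothetical stand-in for the chess case study):
# one knight on a 5x5 board; the state is the knight's (row, col) square.
BOARD = 5
KNIGHT_DELTAS = [(1, 2), (2, 1), (-1, 2), (-2, 1),
                 (1, -2), (2, -1), (-1, -2), (-2, -1)]


def valid_moves(state):
    """Context-aware operator: emit only moves that are legal in `state`."""
    r, c = state
    return [(r + dr, c + dc) for dr, dc in KNIGHT_DELTAS
            if 0 <= r + dr < BOARD and 0 <= c + dc < BOARD]


def explore(steps=500, plateau_limit=20, seed=0):
    """Closed-loop exploration that tracks unique states as the progress signal."""
    rng = random.Random(seed)
    state = (0, 0)
    seen = {state}   # semantic progress signal: set of unique states reached
    stale = 0        # consecutive steps without discovering a new state
    for _ in range(steps):
        moves = valid_moves(state)
        # Prefer moves that lead to a state not yet observed.
        fresh = [m for m in moves if m not in seen]
        state = rng.choice(fresh or moves)
        if state in seen:
            stale += 1
        else:
            seen.add(state)
            stale = 0
        if stale >= plateau_limit:
            # Assumed plateau adaptation: resume from a previously seen state.
            state = rng.choice(sorted(seen))
            stale = 0
    return seen


if __name__ == "__main__":
    print(len(explore()), "unique states reached")
```

Because every move comes from `valid_moves`, no interaction violates the environment's rules, in contrast to the stateless byte-level mutation that the abstract attributes to traditional automated testing.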