VerAct: A Two-Layer Architecture for Provably Safe LLM Agent Planning
Nguyen Vu Nguyen
Abstract
LLM agents violate safety constraints in ways that cause irreversible harm: exceeding medication limits, breaching financial thresholds, ignoring operational prerequisites. The obvious fix (having LLMs verify their own actions) fails; our experiments show LLM-based verification degrades performance by 41% through temporal confusion (62.9%) and arithmetic errors (33.5%). We present VerAct, which separates action proposal (neural) from safety verification (symbolic). Across 28,080 episodes with four LLMs, VerAct achieves 80.1% success with zero constraint violations, while code-generation guardrails achieve only 15.4%. Safety-critical agents require architectural separation of neural creativity and symbolic verification.
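The separation of proposal from verification can be illustrated with a minimal sketch. This is not the VerAct implementation: every name here (`Action`, `State`, `max_total`, `verify`, `run_episode`, `stub_proposer`) is hypothetical, the LLM is replaced by a fixed stub, and the single cumulative-limit constraint is an assumed stand-in for constraints such as medication limits or financial thresholds. The point it shows is that the check is deterministic code that tracks state itself, so it does not rely on the model to do arithmetic or remember earlier steps.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Action:
    name: str
    amount: float  # e.g. a dose in mg or a transfer amount (assumed units)


@dataclass
class State:
    total: float = 0.0  # cumulative amount already executed


# A constraint is a deterministic predicate over (state, proposed action).
Constraint = Callable[[State, Action], bool]


def max_total(limit: float) -> Constraint:
    """Hypothetical constraint: cumulative amount must never exceed `limit`."""
    return lambda state, action: state.total + action.amount <= limit


def verify(state: State, action: Action, constraints: List[Constraint]) -> bool:
    """Symbolic layer: accept an action only if every constraint holds."""
    return all(check(state, action) for check in constraints)


def run_episode(
    propose: Callable[[State], List[Action]],
    constraints: List[Constraint],
    steps: int,
) -> List[Action]:
    """Neural layer proposes candidates; only verified actions are executed."""
    state = State()
    executed: List[Action] = []
    for _ in range(steps):
        candidates = propose(state)
        safe = next((a for a in candidates if verify(state, a, constraints)), None)
        if safe is None:
            continue  # no safe candidate: execute nothing this step
        state.total += safe.amount
        executed.append(safe)
    return executed


def stub_proposer(state: State) -> List[Action]:
    """Stand-in for an LLM: always proposes the same two candidates."""
    return [Action("dose", 400.0), Action("dose", 200.0)]


executed = run_episode(stub_proposer, [max_total(1000.0)], steps=4)
print([a.amount for a in executed])
```

In this sketch an unsafe proposal is simply never executed, so the assumed limit cannot be exceeded regardless of what the proposer returns; the cost is that a step may produce no action at all.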