Poster in Workshop: Agentic AI in the Wild: From Hallucinations to Reliable Autonomy

VerAct: A Two-Layer Architecture for Provably Safe LLM Agent Planning

Nguyen Vu Nguyen


Abstract

LLM agents violate safety constraints in ways that cause irreversible harm: exceeding medication limits, breaching financial thresholds, and ignoring operational prerequisites. The obvious fix, having LLMs verify their own actions, fails: our experiments show that LLM-based verification degrades performance by 41% through temporal confusion (62.9%) and arithmetic errors (33.5%). We present VerAct, which separates action proposal (neural) from safety verification (symbolic). Across 28,080 episodes with four LLMs, VerAct achieves 80.1% success with zero constraint violations, while code-generation guardrails achieve only 15.4%. We conclude that safety-critical agents require architectural separation of neural creativity and symbolic verification.
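The separation described in the abstract can be illustrated with a minimal sketch. This is not the VerAct implementation: the `Action` and `State` types, the `verify` and `run_episode` functions, and the numeric limits are all hypothetical, chosen to mirror the medication-limit example. A stand-in list of proposals plays the role of the neural layer, and a deterministic checker plays the role of the symbolic layer, so the arithmetic and timing checks never pass through an LLM.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

# Hypothetical constraint values, for illustration only.
MAX_CUMULATIVE_DOSE = 4000.0  # cumulative limit over the episode
MIN_INTERVAL = 4              # minimum timesteps between executed doses


@dataclass(frozen=True)
class Action:
    """An action proposed by the (neural) planner."""
    amount: float
    time: int


@dataclass(frozen=True)
class State:
    """Tracked state that the symbolic verifier reasons over."""
    total_dose: float = 0.0
    last_dose_time: Optional[int] = None


def verify(state: State, action: Action) -> Tuple[bool, str]:
    """Deterministic safety check: no LLM involved in arithmetic or timing."""
    if action.amount <= 0:
        return False, "non-positive amount"
    if state.total_dose + action.amount > MAX_CUMULATIVE_DOSE:
        return False, "exceeds cumulative limit"
    if (state.last_dose_time is not None
            and action.time - state.last_dose_time < MIN_INTERVAL):
        return False, "too soon after previous dose"
    return True, "ok"


def apply_action(state: State, action: Action) -> State:
    """State transition, applied only to verified actions."""
    return State(state.total_dose + action.amount, action.time)


def run_episode(
    proposals: Iterable[Action], state: State
) -> Tuple[State, List[Action], List[Tuple[Action, str]]]:
    """Execute proposals in order, dropping any that fail verification."""
    executed: List[Action] = []
    rejected: List[Tuple[Action, str]] = []
    for action in proposals:
        ok, reason = verify(state, action)
        if ok:
            state = apply_action(state, action)
            executed.append(action)
        else:
            rejected.append((action, reason))
    return state, executed, rejected
```

In this sketch, a proposal that would push the cumulative dose past the limit, or that arrives too soon after the previous dose, is rejected with a reason string. A real agent could feed that reason back to the planner as a hint for its next proposal.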
