Poster in Workshop: Building Trust in LLMs and LLM Applications: From Guardrails to Explainability to Regulation
The Fundamental Limits of LLM Unlearning: Complexity-Theoretic Barriers and Provably Optimal Protocols
Aviral Srivastava
Modern machine unlearning techniques for large language models (LLMs) remain heuristic, lacking formal characterization of their fundamental computational limits. We establish the first complexity-theoretic foundation for LLM unlearning, revealing intrinsic tradeoffs between efficiency, precision, and regulatory compliance. Our framework formalizes (ϵ, δ)-machine unlearning via measure-theoretic alignment of retrained and unlearned model distributions, then proves transformer-specific hardness results: exact unlearning is coNP-hard, while approximate unlearning requires Ω(T^{1−o(1)}) time under the Exponential Time Hypothesis (ETH). We construct an optimal Recursive Sketch-and-Freeze protocol achieving these bounds through differential privacy duality and Kronecker-product sketching. Crucially, we identify phase transitions in Rényi unlearning cost at critical model scales (n ≈ d log k). These results provide (1) theoretical benchmarks for evaluating unlearning algorithms, (2) complexity-aware guidelines for AI regulation, and (3) mathematically grounded verification tools for GDPR/CPRA compliance.
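For reference, the (ϵ, δ)-unlearning criterion described above is typically formalized as a two-sided, differential-privacy-style closeness condition between the distribution of the unlearned model and that of a model retrained from scratch without the forget set. A plausible rendering, following the common formalization in the unlearning literature (the paper's exact measure-theoretic definition is not reproduced here), is:

```latex
% DP-style (\epsilon,\delta)-unlearning: the unlearning operator U applied to a
% model trained on D must be distributionally close to retraining on D \ D_f,
% for every forget set D_f and every measurable event S over models.
% (A common formalization; the paper's own definition may differ in detail.)
\forall D,\; \forall D_f \subseteq D,\; \forall \text{ measurable } S:
\Pr\big[\,\mathcal{U}(\mathcal{A}(D), D_f) \in S\,\big]
  \;\le\; e^{\epsilon}\,\Pr\big[\,\mathcal{A}(D \setminus D_f) \in S\,\big] + \delta,
\qquad
\Pr\big[\,\mathcal{A}(D \setminus D_f) \in S\,\big]
  \;\le\; e^{\epsilon}\,\Pr\big[\,\mathcal{U}(\mathcal{A}(D), D_f) \in S\,\big] + \delta.
```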
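The Recursive Sketch-and-Freeze protocol itself is not specified in the abstract, but the Kronecker-product sketching it builds on can be illustrated generically. The sketch below (the names `kronecker_sketch`, `m1`, `m2` are hypothetical, not from the paper) compresses a d1 × d2 parameter or gradient block with two small Gaussian maps A and B; since vec(A G Bᵀ) = (B ⊗ A) vec(G), this is a Kronecker-structured random projection whose storage cost is O(m1·d1 + m2·d2) rather than O(m1·m2·d1·d2) for an unstructured projection of vec(G).

```python
import numpy as np

def kronecker_sketch(G, m1, m2, seed=0):
    """Compress a d1 x d2 matrix G via S = A @ G @ B.T with small
    Gaussian maps A (m1 x d1) and B (m2 x d2). Equivalent to applying
    the Kronecker product (B kron A) to vec(G), but never materializes
    the full m1*m2 x d1*d2 projection matrix."""
    rng = np.random.default_rng(seed)
    d1, d2 = G.shape
    # Scaling by 1/sqrt(m) makes sketched inner products unbiased
    # estimators of the originals in expectation.
    A = rng.standard_normal((m1, d1)) / np.sqrt(m1)
    B = rng.standard_normal((m2, d2)) / np.sqrt(m2)
    return A @ G @ B.T

# Example: sketch a 512 x 512 "gradient" block down to 32 x 32.
G = np.random.default_rng(1).standard_normal((512, 512))
S = kronecker_sketch(G, m1=32, m2=32)
print(S.shape)  # (32, 32)
```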