Poster in Workshop: Agentic AI in the Wild: From Hallucinations to Reliable Autonomy

Efficient Hallucination Detection for LLMs Using Uncertainty-Aware Attention Heads

Artem Vazhentsev ⋅ Lyudmila Rvanova ⋅ Gleb Kuzmin ⋅ Ekaterina Fadeeva ⋅ Ivan Lazichny ⋅ Alexander Panchenko ⋅ Maxim Panov ⋅ Mrinmaya Sachan ⋅ Preslav Nakov ⋅ Timothy Baldwin ⋅ Artem Shelmanov


Abstract

While large language models (LLMs) have become highly capable, they remain prone to factual inaccuracies, commonly referred to as "hallucinations." Uncertainty quantification (UQ) offers a promising way to mitigate this issue, but most existing methods are computationally intensive and/or require supervision. In this work, we propose Recurrent Attention-based Uncertainty Quantification (RAUQ), an unsupervised and efficient framework for identifying hallucinations. The method leverages an observation about transformer attention behavior: when incorrect information is generated, certain "uncertainty-aware" attention heads tend to reduce their focus on preceding tokens. RAUQ automatically detects these attention heads and combines their activation patterns with token-level confidence measures in a recurrent scheme, producing a sequence-level uncertainty estimate in just a single forward pass. Through experiments on twelve tasks spanning question answering, summarization, and translation across four different LLMs, we show that RAUQ consistently outperforms state-of-the-art UQ baselines. Importantly, it incurs minimal overhead, requiring less than 1% additional computation. Since it requires neither labeled data nor extensive parameter tuning, RAUQ serves as a lightweight, plug-and-play solution for real-time hallucination detection in white-box LLMs.
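The recurrent scheme described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: the head-selection rule (largest mean attention to the preceding token), the mixing weight `alpha`, the recurrence itself, and the function names `select_head` and `recurrent_uncertainty` are all hypothetical and may differ from the authors' actual method.

```python
import math

def select_head(attn_prev):
    """Pick a candidate 'uncertainty-aware' head (assumed selection rule).

    attn_prev[h][i] is the attention weight that head h assigns from
    generated token i+1 to the token just before it. We assume the head
    with the largest mean weight on the preceding token is the one to use.
    """
    means = [sum(row) / len(row) for row in attn_prev]
    return max(range(len(means)), key=means.__getitem__)

def recurrent_uncertainty(token_probs, prev_attn, alpha=0.5):
    """Sequence-level uncertainty from a hypothetical recurrent update.

    token_probs[i]: model probability of generated token i (must be > 0).
    prev_attn[i-1]: attention from token i to token i-1 for the chosen head,
                    so it has one entry fewer than token_probs.
    alpha:          assumed mixing weight between token probability and the
                    attention-propagated confidence of the previous token.
    """
    conf = token_probs[0]          # first token has no predecessor
    log_sum = math.log(conf)
    for p, a in zip(token_probs[1:], prev_attn):
        # Confidence mixes the token's own probability with the previous
        # confidence, scaled by how much the head attends to that token.
        conf = alpha * p + (1 - alpha) * a * conf
        log_sum += math.log(conf)
    # Negative mean log-confidence: larger values mean more uncertainty.
    return -log_sum / len(token_probs)
```

Under these assumptions, a drop in attention to the preceding token lowers the running confidence and raises the final score, even when the token probabilities themselves stay the same. All inputs come from quantities already produced during generation, which is consistent with the single-forward-pass claim.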
