
Poster in Workshop: Quantify Uncertainty and Hallucination in Foundation Models: The Next Frontier in Reliable AI

Uncertainty Quantification for MLLMs

Gregory Kang Ruey Lau · Hieu Dao · Bryan Kian Hsiang Low

Keywords: [ MLLM ] [ Uncertainty quantification ]


Abstract:

Multimodal Large Language Models (MLLMs) hold promise for tackling challenging multimodal tasks, but the lack of accurate uncertainty quantification for their responses makes them hard to trust and poses a major barrier to practical deployment in real-life settings. As MLLMs may generate seemingly plausible but erroneous output, producing accurate uncertainty metrics quickly for each MLLM response at inference time could enable interventions such as escalating queries with uncertain responses to human experts or larger models for improved performance. However, existing uncertainty quantification methods require external verifiers, additional model training, or relatively high computational resources, and struggle in challenging scenarios such as out-of-distribution (OOD) or adversarial settings. To overcome these limitations, we present an efficient and effective training-free framework that estimates MLLM output uncertainty at inference time without external tools, by computing metrics based on the diversity of the MLLM's responses, augmented with internal indicators of each output's coherence. We empirically show that our method significantly outperforms benchmarks in predicting incorrect responses and providing calibrated uncertainty estimates, including in settings with OOD and adversarial data.
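The abstract describes the approach only at a high level: sample several responses to the same query, measure how much they disagree semantically, and weight that diversity by an internal coherence signal from the model itself. The sketch below is a minimal illustration of that idea, not the authors' implementation; the `SampledResponse` container, the use of mean token probability as the coherence indicator, the cosine-distance diversity measure, and the `embed` function are all assumptions introduced here.

```python
# Hypothetical sketch of training-free, inference-time uncertainty estimation:
# combine response diversity across samples with an internal coherence indicator.

from dataclasses import dataclass
from typing import Callable, List, Sequence
import math


@dataclass
class SampledResponse:
    text: str
    token_logprobs: Sequence[float]  # per-token log-probabilities reported by the MLLM


def coherence(resp: SampledResponse) -> float:
    """Internal coherence indicator: length-normalized mean token probability (assumed choice)."""
    if not resp.token_logprobs:
        return 0.0
    return math.exp(sum(resp.token_logprobs) / len(resp.token_logprobs))


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u)) or 1.0
    nv = math.sqrt(sum(b * b for b in v)) or 1.0
    return dot / (nu * nv)


def uncertainty_score(
    responses: List[SampledResponse],
    embed: Callable[[str], Sequence[float]],  # any sentence-embedding function
) -> float:
    """
    Return a score in [0, 1]; higher means the response set is less trustworthy.
    Diversity is the coherence-weighted average pairwise semantic distance
    between the sampled responses.
    """
    embs = [embed(r.text) for r in responses]
    weights = [coherence(r) for r in responses]
    num, den = 0.0, 0.0
    for i in range(len(responses)):
        for j in range(i + 1, len(responses)):
            w = weights[i] * weights[j]
            num += w * (1.0 - cosine(embs[i], embs[j]))
            den += w
    # No coherent pairs (e.g., a single or fully incoherent sample) -> maximum uncertainty.
    return num / den if den > 0 else 1.0
```

In a deployment of the kind the abstract envisions, a query whose score exceeds a chosen threshold would be escalated to a human expert or a larger model rather than answered directly.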
