

Poster in Workshop: XAI4Science: From Understanding Model Behavior to Discovering New Scientific Knowledge

Bayesian Concept Bottleneck Models with LLM Priors

Jean Feng · Avni Kothari · Lucas Zier · Chandan Singh · Yan Shuo Tan


Abstract:

Concept Bottleneck Models (CBMs) have been proposed as a compromise between white-box and black-box models, aiming to achieve interpretability without sacrificing accuracy. The standard training procedure for CBMs is to predefine a candidate set of human-interpretable concepts, extract their values from the training data, and identify a sparse subset as inputs to a transparent prediction model. However, such approaches are often hampered by the tradeoff between exploring a sufficiently large set of concepts to include those that are truly relevant versus controlling the cost of obtaining concept extractions. This work investigates a novel approach that sidesteps these challenges: BC-LLM iteratively searches over a potentially infinite set of concepts within a Bayesian framework, in which Large Language Models (LLMs) serve as both a concept extraction mechanism and prior. Even though LLMs can be miscalibrated and hallucinate, we prove that BC-LLM can provide rigorous statistical inference and uncertainty quantification. Across image, text, and tabular datasets, BC-LLM outperforms comparator methods including black-box models, converges more rapidly towards relevant concepts and away from spuriously correlated ones, and is more robust to out-of-distribution samples.
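The abstract describes BC-LLM's mechanics only at a high level, so the sketch below is one way to picture the iterative loop it names, not the paper's actual algorithm. The helpers `llm_extract_values` and `llm_propose_concept` are hypothetical stand-ins for LLM calls (stubbed with seeded random values so the code executes), and the cross-validated evidence proxy and Metropolis-Hastings-style acceptance step are illustrative assumptions.

```python
# Illustrative sketch of the loop described in the abstract: an LLM
# proposes candidate concepts (acting as the prior) and extracts their
# values, while a transparent model scores each concept set. Names
# prefixed "llm_" are hypothetical stand-ins, stubbed so the file runs.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def llm_extract_values(concept, raw_inputs):
    # Hypothetical LLM extraction call, replaced by a hash-seeded stub
    # so the sketch executes end to end without an actual LLM.
    rng = np.random.default_rng(abs(hash(concept)) % 2**32)
    return rng.integers(0, 2, size=len(raw_inputs)).astype(float)


def llm_propose_concept(concepts, slot, raw_inputs):
    # Hypothetical LLM-as-prior call: propose a fresh candidate concept.
    return f"concept_{np.random.default_rng().integers(10**6)}"


def log_evidence_proxy(X, y):
    # Crude stand-in for the marginal likelihood of a concept set:
    # cross-validated log-likelihood of a transparent (logistic) model.
    clf = LogisticRegression(max_iter=1000)
    scores = cross_val_score(clf, X, y, scoring="neg_log_loss", cv=3)
    return float(scores.mean()) * len(y)


def bc_llm_style_search(raw_inputs, y, init_concepts, n_iters=50, seed=0):
    rng = np.random.default_rng(seed)
    concepts = list(init_concepts)
    X = np.column_stack([llm_extract_values(c, raw_inputs) for c in concepts])
    cur = log_evidence_proxy(X, y)
    for _ in range(n_iters):
        j = int(rng.integers(len(concepts)))        # slot to resample
        proposal = llm_propose_concept(concepts, j, raw_inputs)
        X_new = X.copy()
        X_new[:, j] = llm_extract_values(proposal, raw_inputs)
        new = log_evidence_proxy(X_new, y)
        # Metropolis-Hastings-style acceptance lets the chain explore a
        # potentially infinite concept space instead of greedily climbing.
        if np.log(rng.uniform()) < new - cur:
            concepts[j], X, cur = proposal, X_new, new
    return concepts, X


if __name__ == "__main__":
    n = 60
    raw = [f"example {i}" for i in range(n)]        # placeholder raw inputs
    y = np.random.default_rng(1).integers(0, 2, size=n)
    found, _ = bc_llm_style_search(raw, y, ["c0", "c1", "c2"])
    print(found)
```

In the method itself, the proposal and extraction steps would be driven by actual LLM queries, and the scoring would follow the paper's Bayesian posterior rather than a cross-validation proxy; the sketch only conveys how an LLM can serve simultaneously as concept extractor and prior inside an iterative search.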
