Poster
Counterfactual Concept Bottleneck Models
Gabriele Dominici · Pietro Barbiero · Francesco Giannini · Martin Gjoreski · Giuseppe Marra · Marc Langheinrich
Hall 3 + Hall 2B #507
Current deep learning models are not designed to simultaneously address three fundamental questions: predict class labels to solve a given classification task (the "What?"), simulate changes in the situation to evaluate how this impacts class predictions (the "How?"), and imagine how the scenario should change to result in different class predictions (the "Why not?"). While current approaches in causal representation learning and concept interpretability are designed to address some of these questions individually (such as Concept Bottleneck Models, which address both what'' and
how'' questions), no current deep learning model is specifically built to answer all of them at the same time. To bridge this gap, we introduce CounterFactual Concept Bottleneck Models (CF-CBMs), a class of models designed to efficiently address the above queries all at once without the need to run post-hoc searches. Our experimental results demonstrate that CF-CBMs: achieve classification accuracy comparable to black-box models and existing CBMs (“What?”), rely on fewer important concepts leading to simpler explanations (“How?”), and produce interpretable, concept-based counterfactuals (“Why not?”). Additionally, we show that training the counterfactual generator jointly with the CBM leads to two key improvements: (i) it alters the model's decision-making process, making the model rely on fewer important concepts (leading to simpler explanations), and (ii) it significantly increases the causal effect of concept interventions on class predictions, making the model more responsive to these changes.
Live content is unavailable. Log in and register to view live content