Poster
Salvage: Shapley-distribution Approximation Learning Via Attribution Guided Exploration for Explainable Image Classification
Mehdi Naouar · Hanne Raum · Jens Rahnfeld · Yannick Vogt · Joschka Boedecker · Gabriel Kalweit · Maria Kalweit
Hall 3 + Hall 2B #524
The integration of deep learning into critical vision applications has created a need for techniques that explain the rationale behind predictions. In this paper, we address this need by introducing Salvage, a novel removal-based explainability method for image classification. Our approach trains an explainer model to learn the classifier's prediction distribution on masked images. We first introduce the concept of Shapley-distributions, which offers a more accurate approximation of classification probability distributions than existing methods. We then address the imbalance between important and unimportant features: in such settings, naive uniform sampling of feature subsets often yields a highly skewed ratio of samples with high and low prediction likelihoods, which can hinder effective learning. To mitigate this, we propose an informed sampling strategy that leverages approximated feature-importance scores, reducing the imbalance and facilitating the estimation of underrepresented features. Incorporating both principles into our method, we conduct an extensive analysis on the ImageNette, MURA, WBC, and Pet datasets. The results show that Salvage outperforms various baseline explainability methods, including attention-, gradient-, and removal-based approaches, both qualitatively and quantitatively. Furthermore, we demonstrate that our explainer model can serve as a fully explainable classifier without a major decrease in classification performance, paving the way for fully explainable image classification.
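To make the two ingredients concrete, below is a minimal PyTorch sketch of a training step in the spirit the abstract describes: masks over image regions are sampled with keep-probabilities tilted toward approximated importance scores rather than uniformly, and the explainer is fit to the classifier's output distribution on the masked inputs via a KL objective. All names and choices here (`sample_masks`, `salvage_step`, the `explainer` interface, the grid layout, and the 0.5/0.5 mix of uniform and importance-weighted keep-probabilities) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def sample_masks(importance: torch.Tensor, batch_size: int) -> torch.Tensor:
    """Importance-weighted Bernoulli sampling over N features.

    Mixes a uniform keep-rate with normalized importance scores so that
    subsets containing the (possibly rare) important features are drawn
    more often than under naive uniform sampling.
    """
    scores = importance / importance.max().clamp_min(1e-8)
    keep_prob = 0.5 + 0.5 * scores                      # assumed mixing rule
    return torch.bernoulli(keep_prob.expand(batch_size, -1))

def masked_images(images: torch.Tensor, masks: torch.Tensor, grid: int) -> torch.Tensor:
    """Zero out image regions on an assumed grid-by-grid patch layout."""
    b, _, h, w = images.shape
    m = masks.view(b, 1, grid, grid)
    m = F.interpolate(m, size=(h, w), mode="nearest")
    return images * m

def salvage_step(classifier, explainer, images, importance, grid=7):
    """One training step: the explainer learns the classifier's prediction
    distribution on masked inputs through a KL-divergence loss."""
    masks = sample_masks(importance, images.size(0)).to(images.device)
    with torch.no_grad():  # the classifier is frozen; it only supplies targets
        target = F.softmax(classifier(masked_images(images, masks, grid)), dim=-1)
    log_pred = F.log_softmax(explainer(images, masks), dim=-1)
    return F.kl_div(log_pred, target, reduction="batchmean")
```

In this reading, `importance` would itself be refreshed from the explainer's current attributions during training, closing the loop between attribution-guided sampling and distribution learning; the exact update schedule is not specified in the abstract.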