Oral
Oral 7C
Halle A 2
Moderator: Jiaqi Ma
Less is More: Fewer Interpretable Region via Submodular Subset Selection
Ruoyu Chen · Hua Zhang · Siyuan Liang · Jingzhi Li · Xiaochun Cao
Image attribution algorithms aim to identify important regions that are highly relevant to model decisions. Although existing attribution solutions can effectively assign importance to target elements, they still face the following challenges: 1) existing attribution methods generate inaccurate small regions thus misleading the direction of correct attribution, and 2) the model cannot produce good attribution results for samples with wrong predictions. To address the above challenges, this paper re-models the above image attribution problem as a submodular subset selection problem, aiming to enhance model interpretability using fewer regions. To address the lack of attention to local regions, we construct a novel submodular function to discover more accurate small interpretation regions. To enhance the attribution effect for all samples, we also impose four different constraints on the selection of sub-regions, i.e., confidence, effectiveness, consistency, and collaboration scores, to assess the importance of various subsets. Moreover, our theoretical analysis substantiates that the proposed function is in fact submodular. Extensive experiments show that the proposed method outperforms SOTA methods on two face datasets (Celeb-A and VGG-Face2) and one fine-grained dataset (CUB-200-2011). For correctly predicted samples, the proposed method improves the Deletion and Insertion scores with an average of 4.9\% and 2.5\% gain relative to HSIC-Attribution. For incorrectly predicted samples, our method achieves gains of 81.0\% and 18.4\% compared to the HSIC-Attribution algorithm in the average highest confidence and Insertion score respectively. The code is released at https://github.com/RuoyuChen10/SMDL-Attribution.
On the Joint Interaction of Models, Data, and Features
Yiding Jiang · Christina Baek · J Kolter
Learning features from data is one of the defining characteristics of deep learning,but the theoretical understanding of the role features play in deep learning is still inearly development. To address this gap, we introduce a new tool, the interactiontensor, for empirically analyzing the interaction between data and model throughfeatures. With the interaction tensor, we make several key observations abouthow features are distributed in data and how models with different random seedslearn different features. Based on these observations, we propose a conceptualframework for feature learning. Under this framework, the expected accuracy for asingle hypothesis and agreement for a pair of hypotheses can both be derived inclosed form. We demonstrate that the proposed framework can explain empiricallyobserved phenomena, including the recently discovered Generalization Disagreement Equality (GDE) that allows for estimating the generalization error with onlyunlabeled data. Further, our theory also provides explicit construction of naturaldata distributions that break the GDE. Thus, we believe this work provides valuablenew insight into our understanding of feature learning.