Poster in Workshop: Pitfalls of limited data and computation for Trustworthy ML

Understanding the class-specific effects of data augmentations

Polina Kirichenko ⋅ Randall Balestriero ⋅ Mark Ibrahim ⋅ Shanmukha Ramakrishna Vedantam ⋅ Hamed Firooz ⋅ Andrew Wilson

Abstract

Data augmentation (DA) is a major part of modern computer vision, used to encode invariance and improve generalization. However, recent studies have shown that the effects of DA can be highly class dependent: augmentation strategies that improve average accuracy may significantly hurt accuracy on a minority of individual classes, e.g., by as much as $20\%$ on ImageNet. In this work, we explain this phenomenon from the perspective of interactions among class-conditional distributions. We find that the most affected classes are inherently ambiguous, co-occur, or involve fine-grained distinctions. Using the higher-quality multi-label ImageNet annotations, we show that the negative effects of data augmentation on per-class accuracy are significantly less severe than they appear under the original single-label annotations.
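The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the class names, and the toy predictions are all hypothetical, chosen only to show how a per-class accuracy drop under single-label evaluation can shrink once a prediction is counted as correct whenever it falls in a sample's set of acceptable labels.

```python
from collections import defaultdict


def per_class_accuracy(labels, preds, label_sets=None):
    """Accuracy per class, grouped by each sample's original (single) label.

    If label_sets is given, a prediction counts as correct when it is in the
    sample's set of acceptable labels (multi-label evaluation); otherwise it
    must equal the original label exactly (single-label evaluation).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for i, (y, p) in enumerate(zip(labels, preds)):
        total[y] += 1
        if label_sets is None:
            correct[y] += int(p == y)
        else:
            correct[y] += int(p in label_sets[i])
    return {c: correct[c] / total[c] for c in total}


def accuracy_change(acc_before, acc_after):
    """Per-class accuracy difference (after minus before)."""
    return {c: acc_after[c] - acc_before[c] for c in acc_before}


# Toy data (hypothetical): "laptop" and "notebook" stand in for an ambiguous class pair.
labels = ["cat", "cat", "laptop", "laptop", "laptop", "laptop"]
label_sets = [
    {"cat"}, {"cat"}, {"laptop"},
    {"laptop", "notebook"}, {"laptop", "notebook"}, {"laptop", "notebook"},
]
# Hypothetical predictions of models trained with weaker and stronger augmentation.
preds_weak = ["cat", "cat", "laptop", "laptop", "laptop", "notebook"]
preds_strong = ["cat", "cat", "laptop", "notebook", "notebook", "notebook"]

single_change = accuracy_change(
    per_class_accuracy(labels, preds_weak),
    per_class_accuracy(labels, preds_strong),
)
multi_change = accuracy_change(
    per_class_accuracy(labels, preds_weak, label_sets),
    per_class_accuracy(labels, preds_strong, label_sets),
)

print("single-label change:", single_change)
print("multi-label change:", multi_change)
```

In this toy setup the "laptop" class loses 50 percentage points under single-label evaluation but shows no change under multi-label evaluation, because the stronger model's "notebook" predictions are acceptable labels for those images.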
