

Poster in Workshop: Pitfalls of limited data and computation for Trustworthy ML

Understanding the class-specific effects of data augmentations

Polina Kirichenko · Randall Balestriero · Mark Ibrahim · Shanmukha Ramakrishna Vedantam · Hamed Firooz · Andrew Wilson


Abstract: Data augmentation (DA) is a major part of modern computer vision, used to encode invariance and improve generalization. However, recent studies have shown that the effects of DA can be highly class dependent: augmentation strategies that improve average accuracy may significantly hurt accuracy on a minority of individual classes, e.g., by as much as $20\%$ on ImageNet. In this work, we explain this phenomenon from the perspective of interactions among class-conditional distributions. We find that most affected classes are inherently ambiguous, co-occur, or involve fine-grained distinctions. Evaluating with the higher-quality multi-label ImageNet annotations, we show that the negative effects of data augmentation on per-class accuracy are significantly less severe than they appear under the original single-label annotations.
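
As a rough illustration of the kind of comparison the abstract describes, the sketch below computes per-class accuracy for two sets of predictions and then repeats the comparison with multi-label annotations. It is not the authors' code: the function names, the toy predictions, and the rule that a prediction counts as correct when it falls in the example's set of valid labels are all assumptions made for illustration.

```python
import numpy as np


def per_class_accuracy(preds, labels, num_classes):
    """Single-label accuracy per class; NaN for classes with no examples."""
    acc = np.full(num_classes, np.nan)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            acc[c] = (preds[mask] == c).mean()
    return acc


def per_class_multilabel_accuracy(preds, labels, label_sets, num_classes):
    """Per-class accuracy where a prediction is correct if it is in the
    example's set of valid labels (an assumed evaluation rule)."""
    correct = np.array([int(p) in s for p, s in zip(preds, label_sets)])
    acc = np.full(num_classes, np.nan)
    for c in range(num_classes):
        mask = labels == c  # group examples by their original label
        if mask.any():
            acc[c] = correct[mask].mean()
    return acc


# Toy data (hypothetical): six examples, three classes.
labels = np.array([0, 0, 1, 1, 2, 2])
preds_weak = np.array([0, 0, 1, 0, 2, 1])    # model trained with weaker DA
preds_strong = np.array([0, 1, 1, 1, 2, 2])  # model trained with stronger DA

# Multi-label annotations: the second example plausibly shows classes 0 and 1.
label_sets = [{0}, {0, 1}, {1}, {1}, {2}, {2}]

single_delta = (per_class_accuracy(preds_strong, labels, 3)
                - per_class_accuracy(preds_weak, labels, 3))
multi_delta = (per_class_multilabel_accuracy(preds_strong, labels, label_sets, 3)
               - per_class_multilabel_accuracy(preds_weak, labels, label_sets, 3))

avg_weak = (preds_weak == labels).mean()
avg_strong = (preds_strong == labels).mean()

print("average accuracy:", avg_weak, "->", avg_strong)
print("per-class change (single-label):", single_delta)
print("per-class change (multi-label): ", multi_delta)
```

In this toy setup, average accuracy rises under the stronger augmentation while class 0 loses accuracy under single-label evaluation. The loss disappears under the multi-label rule because the "wrong" prediction is a valid label for an ambiguous example.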
