

Poster

Geometry of Long-Tailed Representation Learning: Rebalancing Features for Skewed Distributions

Lingjie Yi · Michael Yao · Weimin Lyu · Haibin Ling · Raphael Douady · Chao Chen

Hall 3 + Hall 2B #442
Thu 24 Apr midnight PDT — 2:30 a.m. PDT

Abstract:

Deep learning has achieved significant success on balanced datasets, but real-world data often follow a long-tailed distribution. Empirical results show that long-tailed data skew the learned representations, with head classes dominating the feature space. Many methods have been proposed to rectify the skewed representations empirically, yet a clear understanding of the underlying cause and extent of this skew is still lacking. In this study, we provide a comprehensive theoretical analysis of how long-tailed data affect feature distributions, and we derive the conditions under which the centers of tail classes shrink together or even collapse into a single point. This shrinkage causes the feature distributions of tail classes to overlap, making features in the overlapping regions inseparable. Moreover, we demonstrate that empirically correcting the skewed representations of the training data alone is insufficient to separate the overlapping features, because of distribution shifts between the training and real data. To address these challenges, we propose a novel long-tailed representation learning method, FeatRecon, which reconstructs the feature space to arrange features from different classes into symmetrical and linearly separable regions. This, in turn, enhances the model’s robustness to long-tailed data. We validate the effectiveness of our method through extensive experiments on the CIFAR-10-LT, CIFAR-100-LT, ImageNet-LT, and iNaturalist 2018 datasets.
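
The abstract's claim that tail-class centers shrink together can be checked on any set of extracted features. The following is a minimal sketch of such a check, not the FeatRecon method itself. The function names, the use of cosine similarity between normalized class means, and the simplex arrangement used as a "symmetrical" reference are all illustrative assumptions rather than details taken from the paper.

```python
import numpy as np


def class_centers(features, labels, num_classes):
    """Return the L2-normalized mean feature vector of each class.

    Assumes every class index in range(num_classes) occurs in `labels`.
    """
    centers = np.zeros((num_classes, features.shape[1]))
    for c in range(num_classes):
        centers[c] = features[labels == c].mean(axis=0)
    norms = np.linalg.norm(centers, axis=1, keepdims=True)
    return centers / np.clip(norms, 1e-12, None)


def pairwise_center_cosine(centers):
    """Cosine similarity between every pair of unit-norm class centers."""
    return centers @ centers.T


def max_center_similarity(centers):
    """Largest off-diagonal cosine; values near 1 suggest two centers
    have nearly collapsed onto each other."""
    cos = pairwise_center_cosine(centers)
    off_diagonal = ~np.eye(len(centers), dtype=bool)
    return cos[off_diagonal].max()


def simplex_targets(num_classes):
    """Hypothetical symmetric reference: K unit vectors in R^K whose
    pairwise cosine is -1/(K-1). This is an assumption made for
    comparison, not the arrangement defined in the paper."""
    k = num_classes
    centered = np.eye(k) - np.ones((k, k)) / k
    return centered / np.linalg.norm(centered, axis=1, keepdims=True)
```

Comparing `max_center_similarity` on features from a model trained on long-tailed data against the value for `simplex_targets` gives a rough indication of how far the learned centers are from a symmetric arrangement.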
