
Poster in Workshop: Bridging the Gap Between Practice and Theory in Deep Learning

Towards Understanding Neural Collapse: The Effects of Batch Normalization and Weight Decay

Leyan Pan · Xinyuan Cao


Abstract:

Neural Collapse (NC) is a geometric structure recently observed at the terminal phase of training deep neural networks. NC implies that the within-class variability of last-layer features tends to zero and that their class means converge to a simplex equiangular tight frame (vectors equally spaced apart in angle). In this paper, we demonstrate that batch normalization (BN) and weight decay (WD) critically influence the occurrence of NC. We establish theoretical guarantees for the emergence of NC in deep neural networks with last-layer BN and WD when the regularized cross-entropy loss is near-optimal. Our experiments substantiate the insights from our theory and further underscore the significant role of BN and WD in the emergence of NC. Our findings offer a novel perspective on the mechanisms of BN and WD, particularly their role as norm regularizers on the feature and weight matrices.
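The abstract characterizes NC by two properties: vanishing within-class variability and class means forming a simplex equiangular tight frame. The snippet below is a minimal sketch (not from the paper) of how these two indicators are commonly measured on last-layer features; the function name and the choice of metrics are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def nc_metrics(features, labels):
    """Sketch of two Neural Collapse indicators.

    features: (N, d) array of last-layer features.
    labels:   (N,) array of class indices in {0, ..., K-1}.
    """
    classes = np.unique(labels)
    global_mean = features.mean(axis=0)
    class_means = np.stack([features[labels == c].mean(axis=0) for c in classes])

    # NC1: within-class variability relative to between-class variability
    # (tends to zero as features collapse to their class means).
    within = np.mean([
        np.mean(np.sum((features[labels == c] - class_means[i]) ** 2, axis=1))
        for i, c in enumerate(classes)
    ])
    between = np.mean(np.sum((class_means - global_mean) ** 2, axis=1))
    nc1 = within / between

    # NC2: centered class means should approach a simplex ETF, i.e. the
    # pairwise cosine similarities all approach -1/(K-1).
    K = len(classes)
    centered = class_means - global_mean
    normed = centered / np.linalg.norm(centered, axis=1, keepdims=True)
    cosines = normed @ normed.T
    off_diag = cosines[~np.eye(K, dtype=bool)]
    etf_gap = np.abs(off_diag + 1.0 / (K - 1)).max()

    return nc1, etf_gap
```

Both quantities shrinking toward zero over training would be consistent with the terminal-phase behavior the paper analyzes.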