Poster in Workshop: Bridging the Gap Between Practice and Theory in Deep Learning
Towards Understanding Neural Collapse: The Effects of Batch Normalization and Weight Decay
Leyan Pan · Xinyuan Cao
Neural Collapse (NC) is a geometric structure recently observed at the terminal phase of training deep neural networks. NC implies that the within-class variability of last-layer features tends to zero and that their class means converge to a simplex equiangular tight frame (vectors equally spaced apart in angle). In this paper, we demonstrate that batch normalization (BN) and weight decay (WD) critically influence the occurrence of NC. We establish theoretical guarantees for the emergence of NC in a deep neural network with last-layer BN and WD when the regularized cross-entropy loss is near-optimal. Our experiments substantiate the insights from our theory and further underscore the significant role of BN and WD in the emergence of NC. Our findings offer a novel perspective on the mechanisms of BN and WD, particularly their role as norm regularizers for the feature and weight matrices.
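To make the two NC properties above concrete, the sketch below (not taken from the paper; the metric choices, normalizations, and function names are illustrative assumptions) computes a standard within-class variability measure and the pairwise cosines of the centered class means, which for a K-class simplex ETF should all approach -1/(K-1).

```python
# Illustrative sketch of common Neural Collapse diagnostics on last-layer
# features H (n x d) with integer labels y. Function names and the exact
# normalizations are assumptions, not the paper's definitions.
import numpy as np

def nc1_within_class_variability(H, y):
    """Tr(Sigma_W @ pinv(Sigma_B)) / K: small values indicate within-class collapse."""
    classes = np.unique(y)
    d = H.shape[1]
    global_mean = H.mean(axis=0)
    Sigma_W = np.zeros((d, d))  # within-class covariance
    Sigma_B = np.zeros((d, d))  # between-class covariance
    for c in classes:
        Hc = H[y == c]
        mu_c = Hc.mean(axis=0)
        Sigma_W += (Hc - mu_c).T @ (Hc - mu_c) / len(H)
        diff = (mu_c - global_mean)[:, None]
        Sigma_B += diff @ diff.T / len(classes)
    return np.trace(Sigma_W @ np.linalg.pinv(Sigma_B)) / len(classes)

def nc2_etf_cosines(H, y):
    """Off-diagonal cosines of centered class means; a simplex ETF gives -1/(K-1)."""
    classes = np.unique(y)
    global_mean = H.mean(axis=0)
    M = np.stack([H[y == c].mean(axis=0) - global_mean for c in classes])
    M = M / np.linalg.norm(M, axis=1, keepdims=True)
    K = len(classes)
    cos = M @ M.T
    return cos[~np.eye(K, dtype=bool)], -1.0 / (K - 1)

# Usage: random features should not exhibit NC; collapsed features would give
# NC1 near zero and off-diagonal cosines near the ideal ETF value -1/(K-1).
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    H = rng.normal(size=(300, 16))
    y = rng.integers(0, 3, size=300)
    print("NC1:", nc1_within_class_variability(H, y))
    observed, ideal = nc2_etf_cosines(H, y)
    print("mean cosine:", observed.mean(), "ideal ETF cosine:", ideal)
```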