In-Person Poster presentation / poster accept

On The Inadequacy of Optimizing Alignment and Uniformity in Contrastive Learning of Sentence Representations

Zhijie Nie · Richong Zhang · Yongyi Mao

MH1-2-3-4 #91

Keywords: [ Deep Learning and representational learning ] [ Uniformity ] [ Sentence representation learning ] [ contrastive learning ] [ alignment ]


Contrastive learning is widely used in areas such as visual representation learning (VRL) and sentence representation learning (SRL). Given the differences between VRL and SRL in negative sample size and evaluation focus, we believe that the solid findings obtained in VRL may not carry over entirely to SRL. In this work, we examine the suitability of the decoupled form of contrastive loss, i.e., alignment and uniformity, in SRL. On the STS task, we find a performance gap between sentence representations obtained by jointly optimizing alignment and uniformity and those obtained using contrastive loss. Further, we find that jointly optimizing alignment and uniformity during training is prone to overfitting, which does not occur with contrastive loss. Analyzing both based on the variation of the gradient norms, we find that contrastive loss has a property of "gradient dissipation" and believe that it is the key to preventing overfitting. We simulate a similar "gradient dissipation" on four optimization objectives of two forms and achieve the same or even better performance than contrastive loss on the STS tasks, confirming our hypothesis.
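For readers unfamiliar with the two objectives compared in the abstract, the following is a minimal sketch of the alignment and uniformity losses next to a standard InfoNCE-style contrastive loss. It assumes the commonly used formulation on unit-normalized embeddings (squared Euclidean distance for alignment, a log mean Gaussian potential for uniformity, in-batch negatives for the contrastive loss). The function names and the default values of `alpha`, `t`, and `tau` are illustrative assumptions and are not taken from the paper, whose exact objectives may differ.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def alignment_loss(anchors, positives, alpha=2.0):
    """Mean of ||a - p||^alpha over positive pairs (alpha is an assumed default)."""
    total = sum(sq_dist(a, p) ** (alpha / 2) for a, p in zip(anchors, positives))
    return total / len(anchors)


def uniformity_loss(embs, t=2.0):
    """Log of the mean Gaussian potential exp(-t * ||x - y||^2) over distinct pairs."""
    vals = [
        math.exp(-t * sq_dist(embs[i], embs[j]))
        for i in range(len(embs))
        for j in range(i + 1, len(embs))
    ]
    return math.log(sum(vals) / len(vals))


def info_nce_loss(anchors, positives, tau=0.05):
    """InfoNCE with in-batch negatives: positives[i] is the positive for anchors[i]."""
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [dot(a, p) / tau for p in positives]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)


if __name__ == "__main__":
    # Toy embeddings (hypothetical values, for illustration only).
    anchors = [normalize(v) for v in ([1.0, 0.1], [0.0, 1.0], [-1.0, 0.2])]
    positives = [normalize(v) for v in ([0.9, 0.2], [0.1, 1.0], [-1.0, 0.1])]
    print("alignment :", alignment_loss(anchors, positives))
    print("uniformity:", uniformity_loss(anchors))
    print("InfoNCE   :", info_nce_loss(anchors, positives))
```

In this sketch the decoupled objective would be a weighted sum of `alignment_loss` and `uniformity_loss`, whereas `info_nce_loss` couples the positive pair and the negatives inside a single softmax term.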
