ReMixMatch: Semi-Supervised Learning with Distribution Matching and Augmentation Anchoring

David Berthelot; Nicholas Carlini; Ekin D. Cubuk; Alex Kurakin; Kihyuk Sohn; Han Zhang; Colin Raffel

Abstract: We improve the recently-proposed "MixMatch" semi-supervised learning algorithm by introducing two new techniques: distribution alignment and augmentation anchoring. Distribution alignment encourages the marginal distribution of predictions on unlabeled data to be close to the marginal distribution of ground-truth labels. Augmentation anchoring feeds multiple strongly augmented versions of an input into the model and encourages each output to be close to the prediction for a weakly-augmented version of the same input. To produce strong augmentations, we propose a variant of AutoAugment which learns the augmentation policy while the model is being trained. Our new algorithm, dubbed ReMixMatch, is significantly more data-efficient than prior work, requiring between 5 times and 16 times less data to reach the same accuracy. For example, on CIFAR-10 with 250 labeled examples we reach 93.73% accuracy (compared to MixMatch's accuracy of 93.58% with 4000 examples) and a median accuracy of 84.92% with just four labels per class.
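The two techniques named in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation. It assumes that distribution alignment is done by rescaling each prediction by the ratio of the label marginal to a running average of model predictions and then renormalizing, and that augmentation anchoring uses cross-entropy against the weak-view prediction. The function names (`align_distribution`, `anchoring_loss`) and the toy numbers are hypothetical.

```python
import math


def normalize(p):
    """Rescale non-negative values so they sum to 1."""
    total = sum(p)
    return [x / total for x in p]


def align_distribution(pred, label_marginal, pred_marginal):
    """Assumed form of distribution alignment: scale each class probability
    by label_marginal / pred_marginal, then renormalize."""
    scaled = [q * t / m for q, t, m in zip(pred, label_marginal, pred_marginal)]
    return normalize(scaled)


def cross_entropy(target, pred, eps=1e-12):
    """Cross-entropy H(target, pred) for two discrete distributions."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, pred))


def anchoring_loss(weak_pred, strong_preds):
    """Assumed form of augmentation anchoring: mean cross-entropy between
    the weak-view target and each strong-view prediction."""
    losses = [cross_entropy(weak_pred, s) for s in strong_preds]
    return sum(losses) / len(losses)


# Toy two-class example (all numbers are made up for illustration).
label_marginal = [0.5, 0.5]   # marginal of ground-truth labels
pred_marginal = [0.8, 0.2]    # running average of predictions on unlabeled data
weak_pred = [0.8, 0.2]        # prediction on a weakly augmented input

# The target is pulled toward the label marginal because the model
# over-predicts class 0 on average.
target = align_distribution(weak_pred, label_marginal, pred_marginal)

# Predictions on two strongly augmented views of the same input.
close_views = [[0.5, 0.5], [0.6, 0.4]]
far_views = [[0.99, 0.01], [0.95, 0.05]]

loss_close = anchoring_loss(target, close_views)
loss_far = anchoring_loss(target, far_views)
```

In this toy setup the loss is lower when the strong-view predictions agree with the aligned weak-view target, which is the behaviour the anchoring objective is meant to reward. In a real training loop the target would typically be treated as a constant (no gradient), which this sketch does not model.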

