Abstract: It has been widely recognized that adversarial examples can be easily crafted to fool deep networks, a vulnerability that mainly stems from the locally non-linear behavior of the networks near input examples. Applying mixup in training provides an effective mechanism to improve generalization performance and model robustness against adversarial perturbations, since it induces globally linear behavior in-between training examples. However, in previous work, mixup-trained models only passively defend against adversarial attacks at inference by directly classifying the inputs, so the induced global linearity is not well exploited. Namely, because adversarial perturbations are local, it should be more efficient to actively break their locality via the globality of the model predictions. Inspired by simple geometric intuition, we develop an inference principle, named mixup inference (MI), for mixup-trained models. MI mixes up the input with other random clean samples, which can shrink and transfer the equivalent perturbation if the input is adversarial. Our experiments on CIFAR-10 and CIFAR-100 demonstrate that MI can further improve the adversarial robustness of models trained with mixup and its variants.
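
The inference procedure described in the abstract can be illustrated with a minimal sketch: mix the (possibly adversarial) input with randomly drawn clean samples and average the resulting predictions. The names and hyperparameters below (`predict_fn`, `clean_pool`, `lam`, `num_samples`) are illustrative assumptions, not the paper's exact algorithm or settings.

```python
import numpy as np

def mixup_inference(predict_fn, x, clean_pool, lam=0.6, num_samples=30):
    """Sketch of mixup inference (MI).

    predict_fn : callable mapping an image array to a class-probability vector.
    x          : the input to classify (possibly adversarially perturbed).
    clean_pool : array of clean samples to mix with.
    lam        : mixing weight kept on the input (assumed value).
    num_samples: number of mixed copies to average over (assumed value).
    """
    preds = []
    for _ in range(num_samples):
        # Draw a random clean sample and interpolate it with the input;
        # if x is adversarial, this shrinks and transfers the perturbation.
        xs = clean_pool[np.random.randint(len(clean_pool))]
        x_mixed = lam * x + (1.0 - lam) * xs
        preds.append(predict_fn(x_mixed))
    # Average the predictions over all mixed copies, exploiting the
    # global linearity induced by mixup training.
    return np.mean(np.stack(preds), axis=0)
```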
