ICLR Poster An Adaptive Policy to Employ Sharpness-Aware Minimization

In-Person Poster presentation / poster accept

Weisen JIANG · Hansi Yang · Yu Zhang · James Kwok

MH1-2-3-4 #125

Keywords: [ Optimization ] [ model generalization ] [ loss landscape ] [ Sharpness-aware minimization ]

Abstract:

Sharpness-aware minimization (SAM), which searches for flat minima by min-max optimization, has been shown to improve model generalization. However, since each SAM update requires computing two gradients, its computational cost and training time are both doubled compared to standard empirical risk minimization (ERM). Recent state-of-the-art methods reduce the fraction of SAM updates, and thus accelerate SAM, by switching between SAM and ERM updates randomly or periodically. In this paper, we design an adaptive policy to employ SAM based on the loss landscape geometry. Two efficient algorithms, AE-SAM and AE-LookSAM, are proposed. We theoretically show that AE-SAM has the same convergence rate as SAM. Experimental results on various datasets and architectures demonstrate the efficiency and effectiveness of the adaptive policy.
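To make the cost argument concrete, the following is a minimal sketch of switching between SAM and ERM updates on a toy quadratic loss. It is not the AE-SAM or AE-LookSAM algorithm: the loss, the hyperparameters, and all function names are hypothetical, and the `by_grad_norm` policy is only an illustrative stand-in for a geometry-based rule, since the abstract does not specify the criterion. The sketch counts gradient evaluations to show why fewer SAM steps means less compute.

```python
import math
import random


def grad(w):
    """Gradient of the toy loss L(w) = 0.5 * (10 * w0^2 + w1^2)."""
    return [10.0 * w[0], 1.0 * w[1]]


def loss(w):
    return 0.5 * (10.0 * w[0] ** 2 + w[1] ** 2)


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def erm_step(w, lr):
    """Plain gradient step: one gradient evaluation."""
    g = grad(w)
    return [wi - lr * gi for wi, gi in zip(w, g)], 1


def sam_step(w, lr, rho):
    """SAM step: perturb to w + rho * g / ||g||, then descend with the
    gradient taken at the perturbed point (two gradient evaluations)."""
    g = grad(w)
    scale = rho / (norm(g) + 1e-12)
    w_adv = [wi + scale * gi for wi, gi in zip(w, g)]
    g_adv = grad(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)], 2


def train(policy, steps=50, lr=0.05, rho=0.05, seed=0):
    """Run `steps` updates; `policy(t, w, rng)` returns True to use SAM."""
    rng = random.Random(seed)
    w = [1.0, 1.0]
    n_grads = 0
    for t in range(steps):
        if policy(t, w, rng):
            w, k = sam_step(w, lr, rho)
        else:
            w, k = erm_step(w, lr)
        n_grads += k
    return w, n_grads


# Switching policies. The first four mirror the options named in the abstract.
always_sam = lambda t, w, rng: True            # plain SAM
never_sam = lambda t, w, rng: False            # plain ERM
periodic = lambda t, w, rng: t % 2 == 0        # SAM on every other step
random_half = lambda t, w, rng: rng.random() < 0.5
# Hypothetical geometry-based rule (an assumption, not the paper's criterion):
# use SAM only while the gradient norm is large.
by_grad_norm = lambda t, w, rng: norm(grad(w)) > 1.0

if __name__ == "__main__":
    for name, policy in [("always_sam", always_sam), ("never_sam", never_sam),
                         ("periodic", periodic), ("random_half", random_half),
                         ("by_grad_norm", by_grad_norm)]:
        w, n = train(policy)
        print(name, "gradient evaluations:", n, "final loss:", loss(w))
```

Over 50 steps, the always-SAM policy uses 100 gradient evaluations, plain ERM uses 50, and the periodic policy uses 75, so the fraction of SAM steps directly sets the overhead. Any adaptive rule can be plugged in as another `policy` function.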
