

Stochastic Modified Equations and Dynamics of Dropout Algorithm

Zhongwang Zhang · Yuqing Li · Tao Luo · Zhiqin Xu

Halle B #278
Tue 7 May 7:30 a.m. PDT — 9:30 a.m. PDT


Dropout is a widely used regularization technique in the training of neural networks; nevertheless, its underlying mechanism and its impact on achieving good generalization remain to be further understood. In this work, we first undertake a rigorous theoretical derivation of stochastic modified equations, with the primary aim of providing an effective approximation for the discrete iterative process of dropout, and we experimentally verify the ability of the resulting stochastic differential equation (SDE) to approximate dropout under a wide range of settings. We then empirically investigate the mechanism by which dropout facilitates the identification of flatter minima. This exploration is conducted through intuitive approximations that exploit the structural analogy between the Hessian of the loss landscape and the covariance of dropout. Our empirical findings substantiate the ubiquitous presence of this Hessian-variance alignment relation throughout the training process of dropout.
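The Hessian-variance alignment mentioned above can be illustrated on a toy problem. The sketch below is a hypothetical, minimal illustration (not the paper's actual model or metric): it trains nothing, but measures, for linear regression with inverted input dropout, the Frobenius-inner-product alignment between the Hessian of the mask-free loss and the empirical covariance of the dropout-induced gradient noise. Since both matrices are positive semi-definite, the alignment is guaranteed to lie in [0, 1].

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative assumption, not the paper's exact architecture):
# linear regression with inverted input dropout at keep probability p.
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true
w = rng.normal(size=d)  # a fixed point at which we probe the noise
p = 0.8  # keep probability

def dropout_grad(w):
    """One full-batch gradient under a fresh inverted-dropout mask on the inputs."""
    m = (rng.random(d) < p) / p      # mask scaled by 1/p (inverted dropout)
    Xm = X * m                       # broadcast mask over input features
    return Xm.T @ (Xm @ w - y) / n

# Empirical covariance C of the dropout gradient noise at w.
grads = np.stack([dropout_grad(w) for _ in range(2000)])
C = np.cov(grads.T)

# Hessian of the mask-free (mean) quadratic loss: H = X^T X / n.
H = X.T @ X / n

# Normalized Frobenius inner product <H, C> / (||H|| ||C||).
alignment = np.sum(H * C) / (np.linalg.norm(H) * np.linalg.norm(C))
print(f"Hessian-covariance alignment: {alignment:.3f}")
```

On this quadratic toy problem the alignment is strictly positive, which mirrors, in miniature, the alignment relation the abstract describes; the paper's actual analysis covers genuinely nonlinear networks and dropout on hidden units.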
