

Poster

Flavors of Margin: Implicit Bias of Steepest Descent in Homogeneous Neural Networks

Nikolaos Tsilivis · Gal Vardi · Julia Kempe

Hall 3 + Hall 2B #441
Fri 25 Apr 7 p.m. PDT — 9:30 p.m. PDT

Abstract:

We study the implicit bias of the family of steepest descent algorithms with infinitesimal learning rate, including gradient descent, sign gradient descent and coordinate descent, in deep homogeneous neural networks. We prove that an algorithm-dependent geometric margin increases during training and characterize the late-stage bias of the algorithms. In particular, we define a generalized notion of stationarity for optimization problems and show that the algorithms progressively reduce a (generalized) Bregman divergence, which quantifies proximity to such stationary points of a margin-maximization problem. We then experimentally zoom into the trajectories of neural networks optimized with various steepest descent algorithms, highlighting connections to the implicit bias of Adam.
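To make the family concrete, below is a minimal sketch (not the paper's code) of normalized steepest descent on a toy linearly separable problem: the ℓ2 geometry recovers gradient descent, the ℓ∞ geometry recovers sign gradient descent, and the ℓ1 geometry recovers coordinate descent. The toy logistic objective, the data, and the helper names are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch (not the paper's code): the steepest descent family on a toy
# linearly separable problem. The toy objective, data, and helper names are
# illustrative assumptions; only the three update rules correspond to the
# algorithms named in the abstract.
import numpy as np

def logistic_grad(w, X, y):
    """Gradient of the average logistic loss mean(log(1 + exp(-y * Xw)))."""
    margins = y * (X @ w)
    return -(X * (y / (1.0 + np.exp(margins)))[:, None]).mean(axis=0)

def steepest_descent_step(w, grad, lr, geometry):
    """One normalized steepest descent step w.r.t. the chosen norm."""
    if geometry == "l2":      # gradient descent
        direction = -grad / (np.linalg.norm(grad) + 1e-12)
    elif geometry == "linf":  # sign gradient descent
        direction = -np.sign(grad)
    elif geometry == "l1":    # coordinate descent: update only the largest-|gradient| coordinate
        direction = np.zeros_like(grad)
        i = np.argmax(np.abs(grad))
        direction[i] = -np.sign(grad[i])
    else:
        raise ValueError(geometry)
    return w + lr * direction

norm_of = {"l2": lambda v: np.linalg.norm(v, 2),
           "linf": lambda v: np.linalg.norm(v, np.inf),
           "l1": lambda v: np.linalg.norm(v, 1)}

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 5))
y = np.sign(X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]))  # separable by construction

for geometry in ("l2", "linf", "l1"):
    w = np.zeros(5)
    for _ in range(2000):
        w = steepest_descent_step(w, logistic_grad(w, X, y), lr=0.01, geometry=geometry)
    # Per the abstract, an algorithm-dependent geometric margin increases during training.
    print(geometry, "normalized margin:", (y * (X @ w)).min() / norm_of[geometry](w))
```

Normalizing the margin by the geometry's own norm mirrors the margin each algorithm is typically biased toward in linear models; this pairing is an assumption made for illustration, not a statement of the paper's exact definitions for deep homogeneous networks.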
