Poster in Workshop: Blog Track Poster Session

Decay No More

Fabian Schaipp

Abstract

Weight decay is among the most important tuning parameters to reach high accuracy for large-scale machine learning models. In this blog post, we revisit AdamW, the weight decay version of Adam, summarizing empirical findings as well as theoretical motivations from an optimization perspective.
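
To make the distinction in the abstract concrete, here is a minimal sketch of a single scalar update step, assuming the standard decoupled formulation of AdamW. The function name `adam_step` and the `decoupled` flag are hypothetical, chosen for illustration only; they are not taken from the blog post or from any library.

```python
import math

def adam_step(w, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=0.0, decoupled=True):
    """One Adam update on a scalar weight w with gradient g (illustrative helper).

    decoupled=True  -> AdamW-style: decay is applied directly to the weight.
    decoupled=False -> Adam with L2 regularization: decay is folded into the gradient.
    """
    if not decoupled:
        # L2 regularization: the penalty gradient passes through Adam's normalization.
        g = g + weight_decay * w

    # Exponential moving averages of the gradient and its square.
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g

    # Bias correction for step t (t starts at 1).
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)

    w_new = w - lr * m_hat / (math.sqrt(v_hat) + eps)

    if decoupled:
        # Decoupled weight decay: shrink the weight independently of the adaptive step.
        w_new -= lr * weight_decay * w

    return w_new, m, v
```

With a zero loss gradient, the decoupled variant shrinks the weight by the factor `1 - lr * weight_decay`, whereas the L2 variant takes a normalized step of size roughly `lr` regardless of how small `weight_decay` is. This is one way to see why the two forms of regularization are not equivalent for Adam.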