Poster

Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs

Shane Bergsma ⋅ Nolan Dey ⋅ Gurpreet Gosal ⋅ Gavia Gray ⋅ Daria Soboleva ⋅ Joel Hestness
2025 Poster

Abstract
