Poster

How Learning Rate Decay Wastes Your Best Data in Curriculum-Based LLM Pretraining

Kairong Luo · Zhenbo Sun · Haodong Wen · Xinyu Shi · Jiarui Cui · Chenyi Dang · Kaifeng Lyu · Wenguang Chen

Abstract
