

Mix Early, Forget Less: Data Mixing During Pretraining Builds Resistance to Forgetting

Lawrence Feng ⋅ Gaurav Ghosal ⋅ Jacob Springer ⋅ Ziqian Zhong ⋅ Aditi Raghunathan

Abstract
