Poster

The Journey Matters: Average Parameter Count over Pre-training Unifies Sparse and Dense Scaling Laws

Tian Jin ⋅ Ahmed Imtiaz Humayun ⋅ Utku Evci ⋅ Suvinay Subramanian ⋅ Amir Yazdanbakhsh ⋅ Dan Alistarh ⋅ Gintare Karolina Dziugaite
2025 Poster
