Contributed Talk in Workshop: Scientific Methods for Understanding Deep Learning (Sci4DL)
Sun, Apr 26, 2026 • 7:15 AM – 7:30 AM PDT

Contributed Talk - Less Data, Faster Training: sampling bias from small dataset can speed up training

Jingwen Liu

Abstract

This work investigates the "small-vs-large gap", in which training on fewer samples can save compute compared with training on a larger dataset. The gap is observed across algorithmic tasks, architectures, and optimizers, and cannot be explained by prior theory. We argue that the speedup comes from appropriate layer-wise norm growth enabled by sampling bias, which is more pronounced when the dataset is smaller. We support this with both theoretical analysis and empirical evidence from various interventions. Together, our results highlight the underexplored potential of jointly considering different resources.
