

Poster in Workshop: Bridging the Gap Between Practice and Theory in Deep Learning

A resource model for neural scaling laws

Jinyeop Song · Ziming Liu · Max Tegmark · Jeff Gore


Abstract: Neural scaling laws characterize how model performance improves as the model size scales up. Inspired by empirical observations, we introduce a $\textit{resource}$ model of neural scaling: a task is usually composite and can therefore be decomposed into many subtasks, which "compete" for $\textit{resources}$ (measured by the number of neurons allocated to each subtask). On toy problems, we empirically find that: (1) the loss of a subtask is inversely proportional to its allocated neurons; (2) when multiple subtasks are present in a composite task, the resources acquired by each subtask grow uniformly as models get larger, keeping the ratios of acquired resources constant. We hypothesize that these findings hold generally and build a model to predict neural scaling laws for general composite tasks, which successfully replicates the neural scaling law of $\textit{Chinchilla}$ models reported in (Hoffmann et al., 2022). We believe that the notion of resource used in this paper will be a useful tool for characterizing and diagnosing neural networks.
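
For intuition, here is a minimal sketch (our own illustration, not the authors' code) of how the two empirical findings combine into a scaling prediction: per-subtask loss inversely proportional to allocated neurons, with allocation ratios held fixed as the model grows. The subtask constants and ratios below are made-up placeholder values.

```python
# Hedged sketch of the resource model described in the abstract, under two
# assumed ingredients:
#   (1) each subtask's loss is inversely proportional to its allocated neurons,
#   (2) allocation ratios across subtasks stay fixed as the model grows.
# The constants c_k and ratios r_k are illustrative placeholders, not fitted values.

import numpy as np

def composite_loss(total_neurons, ratios, constants, irreducible=0.0):
    """Predicted loss of a composite task: sum of per-subtask losses c_k / n_k,
    where n_k = r_k * total_neurons is the resource allocated to subtask k."""
    ratios = np.asarray(ratios, dtype=float)
    constants = np.asarray(constants, dtype=float)
    allocated = ratios / ratios.sum() * total_neurons   # fixed-ratio allocation (2)
    return irreducible + np.sum(constants / allocated)  # inverse-proportional losses (1)

# Under these assumptions, doubling the model size roughly halves the reducible
# part of the composite loss, i.e. a ~1/N scaling for the composite task.
for n in (1_000, 2_000, 4_000):
    print(n, composite_loss(n, ratios=[0.5, 0.3, 0.2], constants=[1.0, 2.0, 0.5]))
```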
