

Poster in Workshop: Bridging the Gap Between Practice and Theory in Deep Learning

On the Diminishing Returns of Width for Continual Learning

Etash Guha · Vihan Lakshman


Abstract:

While deep neural networks have demonstrated groundbreaking performance in various settings, these models often suffer from catastrophic forgetting when trained on new tasks in sequence. Several works have empirically demonstrated that increasing the width of a neural network leads to a decrease in catastrophic forgetting but have yet to characterize the exact relationship between width and continual learning. We design one of the first frameworks to analyze Continual Learning Theory and prove that width is directly related to forgetting in Feed-Forward Networks (FFN), demonstrating the diminishing returns of increasing width for reducing forgetting. We empirically verify our claims at widths hitherto unexplored in prior studies, where the diminishing returns are clearly observed as predicted by our theory.
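As an illustration of the kind of experiment the abstract describes, the sketch below trains a two-layer feed-forward network of varying hidden width on two synthetic tasks in sequence and reports the drop in first-task accuracy as a proxy for forgetting. This is not the authors' code: the task construction, widths, optimizer, and training budget are assumptions made purely for the example.

# Hypothetical sketch (not the authors' code): forgetting vs. hidden width
# for a feed-forward network trained on two synthetic tasks in sequence.
import torch
import torch.nn as nn

def make_task(seed, n=512, d=32, classes=4):
    # Synthetic classification task defined by a random linear teacher.
    g = torch.Generator().manual_seed(seed)
    X = torch.randn(n, d, generator=g)
    W = torch.randn(d, classes, generator=g)
    return X, (X @ W).argmax(dim=1)

def train(model, X, y, steps=300, lr=1e-2):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        nn.functional.cross_entropy(model(X), y).backward()
        opt.step()

def accuracy(model, X, y):
    with torch.no_grad():
        return (model(X).argmax(dim=1) == y).float().mean().item()

def forgetting_for_width(width, d=32, classes=4):
    # Two-layer FFN; width is the hidden dimension under study.
    model = nn.Sequential(nn.Linear(d, width), nn.ReLU(), nn.Linear(width, classes))
    X1, y1 = make_task(seed=0, d=d, classes=classes)
    X2, y2 = make_task(seed=1, d=d, classes=classes)
    train(model, X1, y1)
    acc_before = accuracy(model, X1, y1)  # task-1 accuracy right after task 1
    train(model, X2, y2)
    acc_after = accuracy(model, X1, y1)   # task-1 accuracy after training on task 2
    return acc_before - acc_after         # forgetting on task 1

if __name__ == "__main__":
    torch.manual_seed(0)
    for width in [16, 64, 256, 1024, 4096]:
        print(f"width={width:5d}  forgetting={forgetting_for_width(width):.3f}")

Under these assumptions, forgetting typically shrinks as width grows but with progressively smaller gains per doubling, which is the qualitative pattern the paper formalizes.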
