Poster
in
Workshop: Bridging the Gap Between Practice and Theory in Deep Learning
On the Diminishing Returns of Width for Continual Learning
Etash Guha · Vihan Lakshman
While deep neural networks have demonstrated groundbreaking performance in various settings, these models often suffer from catastrophic forgetting when trained on new tasks in sequence. Several works have empirically demonstrated that increasing the width of a neural network leads to a decrease in catastrophic forgetting but have yet to characterize the exact relationship between width and continual learning. We design one of the first frameworks to analyze Continual Learning Theory and prove that width is directly related to forgetting in Feed-Forward Networks (FFN), demonstrating the diminishing returns of increasing width to reduce forgetting. We empirically verify our claims at widths hitherto unexplored in prior studies, where the diminishing returns are clearly observed as predicted by our theory.
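A minimal sketch (not the authors' code) of how the width-versus-forgetting trend could be probed: train feed-forward networks of increasing hidden width on two sequential tasks and measure how much accuracy on the first task drops after training on the second. The synthetic task construction, widths, and hyperparameters below are illustrative assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_task(n=2000, dim=32):
    """Synthetic binary task: labels come from a random linear rule."""
    X = torch.randn(n, dim)
    w = torch.randn(dim)
    y = (X @ w > 0).long()
    return X, y

def accuracy(model, X, y):
    with torch.no_grad():
        return (model(X).argmax(dim=1) == y).float().mean().item()

def train(model, X, y, epochs=50, lr=1e-2):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()

dim = 32
task_a, task_b = make_task(dim=dim), make_task(dim=dim)

for width in [16, 64, 256, 1024, 4096]:
    # Single-hidden-layer FFN; only the hidden width varies across runs.
    model = nn.Sequential(nn.Linear(dim, width), nn.ReLU(), nn.Linear(width, 2))
    train(model, *task_a)
    acc_before = accuracy(model, *task_a)
    train(model, *task_b)                 # sequential training on the second task
    acc_after = accuracy(model, *task_a)  # re-evaluate the first task
    print(f"width={width:5d}  forgetting on task A: {acc_before - acc_after:.3f}")
```

Under the paper's claim, the printed forgetting values should shrink as width grows, but with progressively smaller gains at the largest widths.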