

Poster

Learning Continually by Spectral Regularization

Alex Lewandowski · Michał Bortkiewicz · Saurabh Kumar · Andras Gyorgy · Dale Schuurmans · Mateusz Ostaszewski · Marlos C. Machado

Hall 3 + Hall 2B #330
Wed 23 Apr 7 p.m. PDT — 9:30 p.m. PDT

Abstract:

Loss of plasticity is a phenomenon where neural networks can become more difficult to train over the course of learning. Continual learning algorithms seek to mitigate this effect by sustaining good performance while maintaining network trainability. We develop a new technique for improving continual learning inspired by the observation that the singular values of the neural network parameters at initialization are an important factor for trainability during early phases of learning. From this perspective, we derive a new spectral regularizer for continual learning that better sustains these beneficial initialization properties throughout training. In particular, the regularizer keeps the maximum singular value of each layer close to one. Spectral regularization directly ensures that gradient diversity is maintained throughout training, which promotes continual trainability, while minimally interfering with performance in a single task. We present an experimental analysis that shows how the proposed spectral regularizer can sustain trainability and performance across a range of model architectures in continual supervised and reinforcement learning settings. Spectral regularization is less sensitive to hyperparameters while also training better on individual tasks, sustaining trainability as new tasks arrive, and achieving better generalization performance.
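To illustrate the idea described in the abstract, the following is a minimal sketch of a spectral penalty that pushes each layer's largest singular value toward one, assuming a PyTorch model whose regularized parameters are 2-D weight matrices. The function name, the choice of torch.linalg.matrix_norm, and the regularization coefficient in the usage comment are illustrative assumptions, not the authors' implementation.

# Illustrative sketch (not the authors' code): penalize deviation of each
# weight matrix's largest singular value from a target of one.
import torch

def spectral_penalty(model, target=1.0):
    """Sum of squared deviations of each layer's top singular value from `target`."""
    penalty = 0.0
    for param in model.parameters():
        if param.ndim == 2:  # weight matrices only; biases (and conv kernels) are skipped here
            # ord=2 gives the spectral norm, i.e., the largest singular value
            sigma_max = torch.linalg.matrix_norm(param, ord=2)
            penalty = penalty + (sigma_max - target) ** 2
    return penalty

# Usage (coefficient is an assumed placeholder, tuned per task in practice):
# loss = task_loss + 0.01 * spectral_penalty(model)

Because the penalty only activates when a layer's spectral norm drifts away from its initialization-like scale, it interferes little with single-task training while discouraging the growth in singular values associated with loss of plasticity.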
