
Workshop: Reincarnating Reinforcement Learning

On The Role of Forgetting in Fine-Tuning Reinforcement Learning Models

Maciej Wołczyk · Bartłomiej Cupiał · Michał Zając · Razvan Pascanu · Łukasz Kuciński · Piotr Miłoś


Recently, foundation models have achieved remarkable results in fields such as computer vision and language processing. Although there has been a significant push to introduce similar approaches in reinforcement learning, these have not yet succeeded on a comparable scale. In this paper, we take a step towards understanding and closing this gap by highlighting a problem specific to foundation RL models: the data shift that occurs during fine-tuning. We show that fine-tuning on compositional tasks, where parts of the environment may only become available after a long training period, is inherently prone to catastrophic forgetting. In such a scenario, a pre-trained model may forget useful knowledge before it even sees the parts of the state space it could solve. We provide examples of this in both grid-world and realistic robotic scenarios. Finally, we show how this problem can be mitigated by using tools from continual learning. We discuss the potential impact of this finding and propose further research directions.
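The abstract does not specify which continual-learning tools are used; one common family of such tools is regularization-based anchoring such as Elastic Weight Consolidation (EWC), which penalizes drift of fine-tuned parameters away from the pre-trained weights in proportion to their estimated importance. A minimal sketch of that idea, assuming a diagonal Fisher-information estimate and toy NumPy parameter vectors (all names and values here are illustrative, not from the paper):

```python
import numpy as np

def ewc_penalty(params, anchor_params, fisher, lam=1.0):
    """EWC-style penalty: a quadratic pull of the fine-tuned parameters
    toward the pre-trained anchor, weighted elementwise by an approximate
    diagonal Fisher information (higher Fisher = more important weight)."""
    return 0.5 * lam * float(np.sum(fisher * (params - anchor_params) ** 2))

# Hypothetical pre-trained weights and their per-parameter importance.
anchor = np.array([1.0, -2.0, 0.5])
fisher = np.array([0.9, 0.1, 0.0])  # important, less important, irrelevant

# During fine-tuning, the total objective would be
#   task_loss + ewc_penalty(params, anchor, fisher, lam).
drifted = anchor + np.array([1.0, 1.0, 1.0])  # equal drift on every weight
print(ewc_penalty(drifted, anchor, fisher, lam=2.0))  # → 1.0
```

The key property illustrated: identical drift on each parameter is penalized unevenly, so weights that matter for the pre-trained behavior are protected while unimportant ones remain free to adapt to the fine-tuning task.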
