Poster in Workshop: 2nd Workshop on Mathematical and Empirical Understanding of Foundation Models
Preserving Principal Subspaces to Reduce Catastrophic Forgetting in Fine-tuning
Jörg Franke · Michael Hefenbrock · Frank Hutter
In this paper, we address catastrophic forgetting in the fine-tuning of Large Language Models (LLMs), where a model loses previously acquired knowledge and capabilities while learning new information. Traditional remedies mostly rely on replaying old training data, which requires knowing which data the model was trained on and having access to it. In contrast, we propose a strategy that operates directly on the model's weight matrices. Using Singular Value Decomposition (SVD), we identify and preserve key components of these matrices, in particular the high- and low-magnitude directions, and restrict updates to the subspace spanned by the medium-impact directions. This mitigates catastrophic forgetting without requiring access to the original training data, offering a simpler and more practical solution for LLM fine-tuning. We demonstrate the benefit of our approach by fine-tuning an LLM and showing a reduced performance drop on benchmark tasks.
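The core idea can be illustrated with a minimal PyTorch-style sketch. This is not the authors' implementation: the function name, the two-sided projection, and the `keep_high`/`keep_low` hyperparameters are assumptions made purely for illustration of how updates might be confined to the medium-magnitude singular directions of a weight matrix.

```python
import torch

def project_to_medium_subspace(weight, update, keep_high=16, keep_low=16):
    """Project a weight update onto the subspace spanned by the
    medium-magnitude singular directions of `weight`, leaving the
    highest and lowest singular directions untouched.

    `keep_high` / `keep_low` are hypothetical hyperparameters for how many
    top / bottom singular directions to preserve.
    """
    # SVD of (a frozen snapshot of) the weight matrix: W = U diag(S) V^T,
    # with singular values sorted in descending order.
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)

    # Medium-impact directions: everything except the first `keep_high`
    # and the last `keep_low` singular vectors.
    U_mid = U[:, keep_high:-keep_low]        # left singular vectors
    V_mid = Vh[keep_high:-keep_low, :].T     # right singular vectors

    # Two-sided projection onto the medium subspace:
    # update_proj = (U_mid U_mid^T) * update * (V_mid V_mid^T)
    return U_mid @ (U_mid.T @ update @ V_mid) @ V_mid.T
```

In a fine-tuning loop, such a projection could be applied to each weight matrix's gradient (or accumulated update) before the optimizer step, so that the preserved high- and low-magnitude directions of the pre-trained weights remain unchanged.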