

Poster

Mitigating Parameter Interference in Model Merging via Sharpness-Aware Fine-Tuning

Yeoreum Lee · Jinwook Jung · Sungyong Baik

Hall 3 + Hall 2B #491
[ Project Page ]
Thu 24 Apr 7 p.m. PDT — 9:30 p.m. PDT

Abstract:

Large-scale deep learning models under the pretraining-finetuning paradigm have led to a surge of task-specific models fine-tuned from a common pre-trained model. Recently, several research efforts have been made on merging these large models into a single multi-task model, particularly with simple arithmetic on parameters. Such merging methodology faces a central challenge: interference between model parameters fine-tuned on different tasks. A few recent works have designed new fine-tuning schemes that reduce parameter interference, but at the cost of the performance of each task-specific fine-tuned model, thereby limiting that of the merged model. To improve the performance of a merged model, we note that a fine-tuning scheme should aim for (1) smaller parameter interference and (2) better performance of each fine-tuned model on its corresponding task. In this work, we design a new fine-tuning objective function that works towards these two goals. In the course of this process, we find the objective function to be strikingly similar to the sharpness-aware minimization (SAM) objective, which aims to achieve generalization by finding flat minima. Drawing upon this observation, we propose to fine-tune pre-trained models via sharpness-aware minimization. Experimental and theoretical results showcase the effectiveness and orthogonality of our proposed approach, improving performance upon various merging and fine-tuning methods. Our code is available at https://github.com/baiklab/SAFT-Merge.
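The abstract describes fine-tuning each task-specific model with a SAM-style objective and then merging the resulting models by simple arithmetic on parameters. Below is a minimal PyTorch sketch of that pipeline, assuming a basic two-step SAM update and a task-arithmetic merge; the function names (`sam_step`, `merge_task_arithmetic`) and hyperparameters (`rho`, `alpha`) are illustrative placeholders and are not taken from the authors' released code.

```python
import torch

def sam_step(model, loss_fn, data, target, optimizer, rho=0.05):
    """One SAM update: perturb toward the ascent direction, then step with the gradient there."""
    optimizer.zero_grad()

    # First forward/backward: gradients at the current parameters.
    loss = loss_fn(model(data), target)
    loss.backward()

    # Perturb parameters by epsilon = rho * g / ||g|| (global gradient norm).
    with torch.no_grad():
        grad_norm = torch.norm(torch.stack(
            [p.grad.norm(p=2) for p in model.parameters() if p.grad is not None]))
        eps = []
        for p in model.parameters():
            if p.grad is None:
                eps.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            eps.append(e)

    # Second forward/backward at the perturbed point gives the SAM gradient.
    optimizer.zero_grad()
    loss_fn(model(data), target).backward()

    # Undo the perturbation, then update with the SAM gradient.
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

def merge_task_arithmetic(pretrained_state, finetuned_states, alpha=0.3):
    """Task-arithmetic merge: merged = pretrained + alpha * sum of task vectors."""
    merged = {k: v.clone() for k, v in pretrained_state.items()}
    for ft_state in finetuned_states:
        for k, v in merged.items():
            if v.is_floating_point():
                v.add_(alpha * (ft_state[k] - pretrained_state[k]))
    return merged
```

In this sketch, each task-specific model would be fine-tuned with `sam_step` instead of a plain gradient step, and the resulting state dicts would then be combined with `merge_task_arithmetic`; the exact objective and merging variants used in the paper may differ.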
