The Role of Data in Model Merging
Gaurav Iyer ⋅ Ekaterina Lobacheva
Abstract
Model merging procedures often include data-dependent components, yet the effect of data is frequently overlooked. Focusing on two key components of the merging process, the computation of permutation symmetries and the correction of activation statistics, we study how the amount and difficulty of data affect model merging. Our experiments show that the choice of data significantly influences merged model performance, with suboptimal choices resulting in up to $2\times$ worse performance than the ideal. We also demonstrate that data affects merged model performance primarily through the correction of activation statistics, and that skewed data subsets consistently lead to incorrect estimates of these statistics.