SumRA: Parameter Efficient Fine-tuning with Singular Value Decomposition and Summed Orthogonal Basis
Chin Yuen Kwok · Yongsen Zheng · Jia Qi Yip · Kwok Yan Lam · Eng Siong Chng
Abstract
Parameter-efficient fine-tuning (PEFT) aims to adapt large pretrained speech models using fewer trainable parameters while maintaining performance. Low-Rank Adaptation (LoRA) achieves this by decomposing weight updates into two low-rank matrices, $A$ and $B$, such that $W'=W_0+BA$. Previous studies showed that freezing $A$ and updating only $B$ can further reduce the number of trainable parameters while achieving performance close to standard LoRA, where $A$ is initialized with the principal singular vectors of $W_0$ obtained via singular value decomposition (SVD). However, because $A$ is typically initialized with only the leading singular vectors, its representational capacity is confined to a narrow subspace of the model's knowledge. To overcome this limitation, we propose SumRA, which initializes each row of $A$ as a sum of multiple singular vectors chosen from beyond the leading components, thereby enabling $A$ to influence a larger portion of the model's knowledge space. Experiments on multilingual automatic speech recognition (ASR) show that, when adapting Whisper to five new languages from Common Voice with only 10 hours of data each, our method reduces the word error rate of LoRA baselines from 14.42\% to 12.41\% while using 50\% fewer trainable parameters.
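To make the initialization scheme concrete, the sketch below shows one way the frozen matrix $A$ could be built from summed non-leading singular vectors of $W_0$, as the abstract describes. The abstract does not specify which singular vectors are summed into each row, so the selection rule, the `skip` and `group` hyperparameters, and the function name `sumra_init` are illustrative assumptions rather than the paper's exact procedure.

```python
import torch

def sumra_init(W0: torch.Tensor, r: int, skip: int = 0, group: int = 4):
    """Build a frozen A from summed non-leading singular vectors of W0 (sketch).

    W0:    pretrained weight, shape (d_out, d_in)
    r:     adapter rank (number of rows of A)
    skip:  leading singular vectors to pass over (assumed hyperparameter)
    group: singular vectors summed into each row  (assumed hyperparameter)
    """
    # Rows of Vh are the right singular vectors of W0, ordered by singular value
    _, _, Vh = torch.linalg.svd(W0, full_matrices=False)

    rows = []
    for i in range(r):
        # Take `group` consecutive singular vectors beyond the leading `skip` ones
        start = skip + i * group
        block = Vh[start:start + group]   # (group, d_in)
        rows.append(block.sum(dim=0))     # summed basis vectors form one row of A
    A = torch.stack(rows)                 # (r, d_in); kept frozen during fine-tuning

    # B starts at zero so the adapted weight W' = W0 + B @ A equals W0 before training
    B = torch.zeros(W0.shape[0], r, requires_grad=True)
    return A, B


# Example: a rank-8 adapter for a 768x768 projection
W0 = torch.randn(768, 768)
A, B = sumra_init(W0, r=8, skip=8, group=4)
W_adapted = W0 + B @ A                    # only B receives gradient updates
```

During fine-tuning, $A$ stays fixed and only $B$ is updated, which is what halves the trainable parameters relative to standard LoRA, where both $A$ and $B$ are trained.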