

Poster

SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression

Xin Wang · Yu Zheng · Zhongwei Wan · Mi Zhang

Hall 3 + Hall 2B #236
Thu 24 Apr 7 p.m. PDT — 9:30 p.m. PDT

Abstract:

The advancement of Large Language Models (LLMs) has been hindered by their substantial sizes, which necessitate LLM compression methods for practical deployment. Singular Value Decomposition (SVD) offers a promising solution for LLM compression. However, state-of-the-art SVD-based LLM compression methods have two key limitations: truncating smaller singular values may still lead to higher compression loss, and the compressed weights are not updated after SVD truncation. In this work, we propose SVD-LLM, an SVD-based post-training LLM compression method that addresses the limitations of existing methods. SVD-LLM incorporates a truncation-aware data whitening technique to ensure a direct mapping between singular values and compression loss. Moreover, SVD-LLM adopts a parameter update with sequential low-rank approximation to compensate for the accuracy degradation after SVD compression. We evaluate SVD-LLM on 10 datasets and seven models from three different LLM families at three different scales. Our results demonstrate the superiority of SVD-LLM over the state of the art, especially at high model compression ratios.
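To make the truncation/loss connection concrete, below is a minimal NumPy sketch of the generic SVD truncation that SVD-LLM builds on — not the authors' whitening or parameter-update steps. The matrix size and rank are illustrative assumptions; the key point is that, for plain truncated SVD, the squared Frobenius compression loss equals the energy in the discarded singular values (Eckart–Young).

```python
import numpy as np

# Stand-in for an LLM weight matrix (illustrative size, random values).
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 512))

# Full SVD, then keep only the top-k singular values.
U, S, Vt = np.linalg.svd(W, full_matrices=False)
rank = 128
A = U[:, :rank] * S[:rank]   # (512, 128): fold sigmas into the left factor
B = Vt[:rank, :]             # (128, 512)
W_approx = A @ B             # storing A and B costs 2*512*128 params,
                             # i.e. 50% of the original 512*512

# For plain truncated SVD, ||W - W_k||_F^2 equals the sum of the
# squared truncated singular values (Eckart-Young theorem).
loss = np.linalg.norm(W - W_approx, "fro") ** 2
truncated_energy = np.sum(S[rank:] ** 2)
print(np.isclose(loss, truncated_energy))  # True
```

SVD-LLM's whitening step aims to preserve this kind of direct singular-value-to-loss mapping when the loss is measured on whitened activations rather than on the raw weights.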
