Poster
MCNC: Manifold-Constrained Reparameterization for Neural Compression
Chayne Thrash · Reed Andreas · Ali Abbasi · Parsa Nooralinejad · Soroush Abbasi Koohpayegani · Hamed Pirsiavash · Soheil Kolouri
Hall 3 + Hall 2B #352
The outstanding performance of large foundational models across diverse tasks, from computer vision to speech and natural language processing, has significantly increased their demand. However, storing and transmitting these models poses significant challenges due to their massive size (e.g., 750GB for Llama 3.1 405B). Recent literature has focused on compressing the original weights or reducing the number of parameters required for fine-tuning these models. These compression methods generally constrain the parameter space, for example, through low-rank reparameterization (e.g., LoRA), pruning, or quantization (e.g., QLoRA) during or after model training. In this paper, we present a novel model compression method, which we term Manifold-Constrained Neural Compression (MCNC). This method constrains the parameter space to low-dimensional, pre-defined, and frozen nonlinear manifolds that effectively cover this space. Given the prevalence of good solutions in over-parameterized deep neural networks, we show that by constraining the parameter space to our proposed manifold, we can identify high-quality solutions while achieving unprecedented compression rates across a wide variety of tasks and architectures. Through extensive experiments in computer vision and natural language processing tasks, we demonstrate that our method significantly outperforms state-of-the-art baselines in terms of compression, accuracy, and/or model reconstruction time. Our code is publicly available at https://github.com/mint-vu/MCNC.
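To make the reparameterization idea concrete, here is a minimal PyTorch sketch of how weights can be generated from a small learnable code pushed through a frozen, randomly initialized nonlinear map. This is only an illustration of the general manifold-constrained reparameterization concept, not the paper's actual generator: the class name, sinusoidal activation, hidden width, and seed handling are assumptions; see the linked repository for the authors' implementation.

```python
import torch
import torch.nn as nn

class ManifoldReparamLinear(nn.Module):
    """Illustrative sketch (not MCNC's exact parameterization): the weights of a
    linear layer are produced by mapping a small learnable code through a frozen,
    randomly initialized nonlinear generator, so only the low-dimensional code
    (and bias) is trained and stored."""

    def __init__(self, in_features, out_features, code_dim=64, hidden_dim=256, seed=0):
        super().__init__()
        self.in_features = in_features
        self.out_features = out_features
        n_weights = in_features * out_features

        # Learnable low-dimensional code: the only compressed, trainable weight state.
        self.code = nn.Parameter(torch.zeros(code_dim))
        self.bias = nn.Parameter(torch.zeros(out_features))

        # Frozen nonlinear generator, reproducible from the seed, so it need not
        # be stored alongside the compressed model.
        gen = torch.Generator().manual_seed(seed)
        self.register_buffer("W1", torch.randn(hidden_dim, code_dim, generator=gen) / code_dim**0.5)
        self.register_buffer("W2", torch.randn(n_weights, hidden_dim, generator=gen) / hidden_dim**0.5)

    def forward(self, x):
        # Map the code onto the frozen nonlinear manifold to obtain the layer weights.
        h = torch.sin(self.W1 @ self.code)  # nonlinearity chosen for illustration
        w = (self.W2 @ h).view(self.out_features, self.in_features)
        return nn.functional.linear(x, w, self.bias)

# Usage: only `code` and `bias` receive gradients; the generator stays frozen.
layer = ManifoldReparamLinear(in_features=512, out_features=512, code_dim=64)
y = layer(torch.randn(8, 512))
```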