Poster
in
Workshop: 5th Workshop on practical ML for limited/low resource settings (PML4LRS) @ ICLR 2024

PC-LoRA: Progressive Model Compression with Low Rank Adaptation

Injoon Hwang · HaeWon Park · Jooyoung Yang · SunJae Maeng · Youngwan Lee


Abstract:

This work presents Progressive Compression LoRA (PC-LoRA), a novel extension of Low-Rank Adaptation (LoRA) designed to enable simultaneous model compression and parameter-efficient fine-tuning. To mitigate the computational costs of large-scale models, PC-LoRA introduces an approach that decays the pre-trained model weights to zero: the pre-trained weights are progressively attenuated during fine-tuning until they are completely removed, leaving only the low-rank adapters. Through empirical analysis on various models, we demonstrate that PC-LoRA significantly reduces computational costs with only minor performance degradation. Compared to full fine-tuning and LoRA fine-tuning, PC-LoRA shows an average performance drop of 3.085%. Despite this, our method substantially compresses models, achieving reductions of 94.1% in parameters and 89.1% in FLOPs for vision models, and 93.5% in parameters and 84.2% in FLOPs for NLP models.
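The core idea — scaling the frozen pre-trained weights by a factor that decays from 1 to 0 over fine-tuning, so that only the low-rank adapters remain at the end — can be sketched as follows. This is a minimal illustration assuming a linear decay schedule and a standard LoRA parameterization (`lora_A`, `lora_B`); the class name `PCLoRALinear`, the `set_progress` method, and the schedule are hypothetical and not taken from the paper.

```python
import torch
import torch.nn as nn

class PCLoRALinear(nn.Module):
    """Sketch of a PC-LoRA-style layer: the frozen pre-trained weight's
    contribution is scaled by a decay factor that goes from 1.0 to 0.0
    during fine-tuning, so the compressed model keeps only the adapter.
    (Hypothetical implementation; the paper's exact schedule may differ.)"""

    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pre-trained weights
        # standard LoRA factors: B is zero-initialized so training starts
        # from the pre-trained behavior
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.decay = 1.0  # decays 1.0 -> 0.0 over the course of fine-tuning

    def set_progress(self, t: float) -> None:
        # t in [0, 1]: fraction of fine-tuning completed (linear decay assumed)
        self.decay = max(0.0, 1.0 - t)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # decaying pre-trained path + low-rank adapter path
        return self.decay * self.base(x) + x @ self.lora_A.T @ self.lora_B.T
```

At `t = 1.0` the pre-trained path vanishes entirely, so the layer can be replaced by the two small low-rank matrices alone — this is where the parameter and FLOP reductions reported in the abstract come from.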
