

Poster

Zeroth-Order Fine-Tuning of LLMs with Transferable Static Sparsity

Wentao Guo · Jikai Long · Yimeng Zeng · Zirui Liu · Xinyu Yang · Yide Ran · Jacob Gardner · Osbert Bastani · Christopher De Sa · Xiaodong Yu · Beidi Chen · Zhaozhuo Xu

Hall 3 + Hall 2B #619
Thu 24 Apr 7 p.m. PDT — 9:30 p.m. PDT

Abstract:

Zeroth-order optimization (ZO) is a memory-efficient strategy for fine-tuning Large Language Models using only forward passes. However, applying ZO fine-tuning in memory-constrained settings such as mobile phones and laptops remains challenging, since these settings often involve weight quantization, while ZO requires full-precision perturbations and updates. In this study, we address this limitation by combining static sparse ZO fine-tuning with quantization. Our approach transfers a small, static subset (0.1%) of "sensitive" parameters from pre-training to downstream tasks, focusing fine-tuning on this sparse set of parameters. The remaining untuned parameters are quantized, reducing memory demands. Our proposed workflow enables efficient ZO fine-tuning of a Llama2-7B model on a GPU device with less than 8GB of memory while outperforming full-model ZO fine-tuning and in-context learning.
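The abstract describes ZO fine-tuning restricted to a static sparse set of sensitive parameters. Below is a minimal sketch of one SPSA-style ZO update confined to such a mask, assuming a PyTorch model, a user-supplied `loss_fn(model, batch)`, and a `masks` dictionary of boolean tensors marking the sensitive parameters; these names are illustrative assumptions, and quantization of the frozen weights is omitted.

```python
import torch

def sparse_zo_step(model, loss_fn, batch, masks, lr=1e-6, eps=1e-3, seed=None):
    """One zeroth-order (SPSA-style) update restricted to a static sparse mask.

    masks: dict mapping parameter name -> boolean tensor selecting the small
    "sensitive" subset that stays trainable; all other parameters are frozen
    (and, in the paper's setting, kept quantized).
    """
    if seed is None:
        seed = torch.randint(0, 2**31 - 1, (1,)).item()

    params = {n: p for n, p in model.named_parameters() if n in masks}

    def perturb(scale):
        # Regenerate the same random direction from the seed so the
        # perturbation never needs to be stored in memory.
        gen = torch.Generator(device="cpu").manual_seed(seed)
        for n, p in params.items():
            z = torch.randn(p.shape, generator=gen, dtype=p.dtype)
            p.data.add_(scale * eps * z.to(p.device) * masks[n].to(p.device))

    with torch.no_grad():
        perturb(+1.0)                      # theta + eps * z (masked)
        loss_plus = loss_fn(model, batch)
        perturb(-2.0)                      # theta - eps * z (masked)
        loss_minus = loss_fn(model, batch)
        perturb(+1.0)                      # restore original weights

        # Scalar finite-difference estimate of the directional derivative.
        grad_scalar = (loss_plus - loss_minus) / (2.0 * eps)

        # Update only the masked entries along the sampled direction.
        gen = torch.Generator(device="cpu").manual_seed(seed)
        for n, p in params.items():
            z = torch.randn(p.shape, generator=gen, dtype=p.dtype)
            p.data.add_(-lr * grad_scalar * z.to(p.device) * masks[n].to(p.device))

    return loss_plus.item()
```

Because the perturbation direction is reconstructed from a seed and the update touches only the masked entries, the step needs no gradient storage and no optimizer state for the frozen (quantized) weights, which is where the memory savings described above come from.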
