Prune and Tune: Improving Efficient Pruning Techniques for Massive Language Models

Aaquib Syed · Phillip Guo

2023 Oral
in
Affinity Event: Tiny Papers Showcase Day (a DEI initiative)

Abstract

Massive language models with billions of parameters incur significant compute costs and can therefore benefit from pruning. Pruning techniques for such models are typically iterative and require extensive weight retraining after pruning. SparseGPT, a recently introduced one-shot technique, enables pruning without retraining. We improve upon SparseGPT by fine-tuning during pruning with a minimal number of training steps. In experiments against magnitude pruning, our iteratively fine-tuned SparseGPT models significantly outperform their magnitude-pruned counterparts at high sparsity.
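To make the terms in the abstract concrete, here is a minimal sketch of the magnitude pruning baseline combined with a few fine-tuning steps between pruning rounds. It is an illustration only: the toy least-squares model, the function names `magnitude_mask` and `prune_and_tune`, and the sparsity schedule, step count, and learning rate are all assumptions, not the authors' setup. SparseGPT chooses which weights to remove with a different, layer-wise criterion that is not reproduced here.

```python
import numpy as np

def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that zeroes the `sparsity` fraction of smallest-|w| entries."""
    k = int(round(sparsity * weights.size))  # number of weights to remove
    mask = np.ones(weights.size, dtype=weights.dtype)
    if k > 0:
        # Flat indices of the k smallest magnitudes.
        drop = np.argsort(np.abs(weights), axis=None)[:k]
        mask[drop] = 0.0
    return mask.reshape(weights.shape)

def prune_and_tune(w, x, y, sparsities=(0.25, 0.5, 0.75), steps=20, lr=0.05):
    """Alternate magnitude pruning with a few gradient steps (hypothetical schedule)."""
    w = w.copy()
    mask = np.ones_like(w)
    for s in sparsities:
        # Already-pruned weights are zero, so they stay among the smallest.
        mask = magnitude_mask(w * mask, s)
        w = w * mask
        for _ in range(steps):
            # Gradient of the mean squared error of the toy linear model.
            grad = 2.0 * x.T @ (x @ w - y) / len(x)
            w = (w - lr * grad) * mask  # keep pruned weights at zero
    return w, mask

# Toy data: a linear model stands in for a single weight matrix.
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 16))
w_true = rng.normal(size=(16, 1))
y = x @ w_true
w_init = w_true + 0.1 * rng.normal(size=(16, 1))
w_sparse, mask = prune_and_tune(w_init, x, y)
```

Re-applying the mask after every gradient step is the design choice that keeps the sparsity pattern fixed while the remaining weights adapt to compensate for the removed ones.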
