

Poster

SynQ: Accurate Zero-shot Quantization by Synthesis-aware Fine-tuning

Minjun Kim · Jongjin Kim · U Kang

Hall 3 + Hall 2B #77
[ Project Page ]
Fri 25 Apr 7 p.m. PDT — 9:30 p.m. PDT

Abstract:

How can we accurately quantize a pre-trained model without any data? Quantization algorithms are widely used for deploying neural networks on resource-constrained edge devices. Zero-shot Quantization (ZSQ) addresses the crucial and practical scenario where training data are inaccessible for privacy or security reasons. However, three significant challenges hinder the performance of existing ZSQ methods: 1) noise in the synthetic dataset, 2) predictions based on off-target patterns, and 3) misguidance by erroneous hard labels. In this paper, we propose SynQ (Synthesis-aware Fine-tuning for Zero-shot Quantization), a carefully designed ZSQ framework that overcomes the limitations of existing methods. SynQ minimizes the noise in the generated samples by exploiting a low-pass filter. Then, SynQ trains the quantized model to improve accuracy by aligning its class activation map with that of the pre-trained model. Furthermore, SynQ mitigates misguidance from the pre-trained model's errors by leveraging only soft labels for difficult samples. Extensive experiments show that SynQ achieves state-of-the-art accuracy, outperforming existing ZSQ methods.
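To make the three components concrete, here is a minimal PyTorch-style sketch of the ideas the abstract describes: low-pass filtering of synthetic samples, class activation map alignment, and soft-label-only supervision for difficult samples. The function names, the FFT-based circular filter, the MSE alignment objective, and the confidence threshold are all illustrative assumptions, not the authors' actual implementation.

```python
# Hypothetical sketch of the three fine-tuning ideas from the abstract.
# All names and design choices below are assumptions for illustration.
import torch
import torch.nn.functional as F

def low_pass_filter(images: torch.Tensor, cutoff: float = 0.25) -> torch.Tensor:
    """Suppress high-frequency noise in synthetic images with an FFT mask.
    `cutoff` is the fraction of the (centered) spectrum to keep; the exact
    filter used by SynQ may differ."""
    freq = torch.fft.fftshift(torch.fft.fft2(images), dim=(-2, -1))
    h, w = images.shape[-2:]
    ys = torch.linspace(-1, 1, h).view(-1, 1)
    xs = torch.linspace(-1, 1, w).view(1, -1)
    mask = ((ys ** 2 + xs ** 2).sqrt() <= cutoff).to(images.dtype)
    filtered = torch.fft.ifft2(torch.fft.ifftshift(freq * mask, dim=(-2, -1)))
    return filtered.real

def cam_alignment_loss(student_maps: torch.Tensor,
                       teacher_maps: torch.Tensor) -> torch.Tensor:
    """Encourage the quantized (student) model to attend to the same regions
    as the pre-trained (teacher) model, to discourage predictions based on
    off-target patterns. A simple MSE over normalized activation maps stands
    in for the paper's alignment objective."""
    s = F.normalize(student_maps.flatten(1), dim=1)
    t = F.normalize(teacher_maps.flatten(1), dim=1)
    return F.mse_loss(s, t)

def label_loss(student_logits: torch.Tensor, teacher_logits: torch.Tensor,
               hard_labels: torch.Tensor, threshold: float = 0.5) -> torch.Tensor:
    """Always distill from the teacher's soft labels, but apply the hard-label
    loss only to samples the teacher is confident about; difficult samples use
    soft labels alone, mitigating misguidance from erroneous hard labels."""
    teacher_probs = F.softmax(teacher_logits, dim=1)
    confidence = teacher_probs.max(dim=1).values
    soft = F.kl_div(F.log_softmax(student_logits, dim=1),
                    teacher_probs, reduction="none").sum(dim=1)
    hard = F.cross_entropy(student_logits, hard_labels, reduction="none")
    easy = (confidence >= threshold).float()  # 1 for easy samples, 0 for difficult
    return (soft + easy * hard).mean()

if __name__ == "__main__":
    # Toy shapes: a batch of synthetic images and per-sample logits/maps.
    imgs = low_pass_filter(torch.randn(4, 3, 32, 32))
    s_logits, t_logits = torch.randn(4, 10), torch.randn(4, 10)
    labels = t_logits.argmax(dim=1)  # hard labels from the teacher
    s_maps, t_maps = torch.randn(4, 49), torch.randn(4, 49)
    loss = cam_alignment_loss(s_maps, t_maps) + label_loss(s_logits, t_logits, labels)
    print(imgs.shape, loss.item())
```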
