Poster
SynQ: Accurate Zero-shot Quantization by Synthesis-aware Fine-tuning
Minjun Kim · Jongjin Kim · U Kang
Hall 3 + Hall 2B #77
How can we accurately quantize a pre-trained model without any data? Quantization algorithms are widely used for deploying neural networks on resource-constrained edge devices. Zero-shot Quantization (ZSQ) addresses the crucial and practical scenario where training data are inaccessible for privacy or security reasons. However, three significant challenges hinder the performance of existing ZSQ methods: 1) noise in the synthetic dataset, 2) predictions based on off-target patterns, and 3) misguidance by erroneous hard labels. In this paper, we propose SynQ (Synthesis-aware Fine-tuning for Zero-shot Quantization), a carefully designed ZSQ framework that overcomes the limitations of existing methods. SynQ minimizes the noise in the generated samples by applying a low-pass filter. Then, SynQ trains the quantized model to improve accuracy by aligning its class activation map with that of the pre-trained model. Furthermore, SynQ mitigates misguidance from the pre-trained model's errors by leveraging only soft labels for difficult samples. Extensive experiments show that SynQ provides state-of-the-art accuracy over existing ZSQ methods.
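The first component, suppressing high-frequency noise in the synthetic samples, can be illustrated with a minimal FFT-based low-pass filter. This is a sketch under assumptions, not the authors' code: square NCHW image batches and a hypothetical `cutoff_ratio` parameter; SynQ's exact filter design may differ.

```python
import torch

def low_pass_filter(images: torch.Tensor, cutoff_ratio: float = 0.25) -> torch.Tensor:
    """Suppress high-frequency noise in synthetic images via a centered FFT mask.

    `cutoff_ratio` is a hypothetical knob (fraction of the spectrum kept),
    not a value taken from the paper.
    """
    _, _, h, w = images.shape
    # Move the zero-frequency component to the center of the spectrum.
    freq = torch.fft.fftshift(torch.fft.fft2(images), dim=(-2, -1))
    # Build a circular mask that keeps only low frequencies.
    ys = torch.arange(h, device=images.device) - h // 2
    xs = torch.arange(w, device=images.device) - w // 2
    yy, xx = torch.meshgrid(ys, xs, indexing="ij")
    radius = cutoff_ratio * min(h, w) / 2
    mask = (yy ** 2 + xx ** 2 <= radius ** 2).to(freq.dtype)
    # Zero out high frequencies and transform back to the spatial domain.
    filtered = torch.fft.ifft2(torch.fft.ifftshift(freq * mask, dim=(-2, -1)))
    return filtered.real
```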
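The two fine-tuning ideas, aligning activation maps with the pre-trained model and using only soft labels for difficult samples, can likewise be sketched as a single loss. Everything here is an assumption for illustration: the channel-mean feature map stands in as a CAM-style proxy, and the confidence threshold `tau` and weight `lam` are hypothetical placeholders, not values from the paper.

```python
import torch
import torch.nn.functional as F

def synthesis_aware_loss(q_feat, q_logits, fp_feat, fp_logits, labels,
                         tau: float = 0.9, lam: float = 1.0):
    # Soft labels come from the pre-trained (full-precision) teacher.
    soft = F.softmax(fp_logits, dim=1)
    kd = F.kl_div(F.log_softmax(q_logits, dim=1), soft, reduction="batchmean")

    # Hard-label cross-entropy is applied only where the teacher is confident;
    # difficult samples rely on soft labels alone, avoiding erroneous hard labels.
    confidence = soft.max(dim=1).values
    easy = confidence >= tau
    ce = (F.cross_entropy(q_logits, labels, reduction="none") * easy).mean()

    # Align spatial activation maps (channel-mean feature maps as a CAM proxy).
    cam_align = F.mse_loss(q_feat.mean(dim=1), fp_feat.mean(dim=1))

    return kd + ce + lam * cam_align
```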