Beyond Uniformity: Sample and Frequency Meta Weighting for Post-Training Quantization of Diffusion Models
Abstract
Post-training quantization (PTQ) is an attractive approach for compressing diffusion models, speeding up the sampling process and reducing the memory footprint. Most existing PTQ methods uniformly sample data from the various time steps of the denoising process to construct a calibration set and treat all calibration samples as equally important during quantization. However, treating all samples equally may not be optimal: a notable property of the denoising process in diffusion models is that low-frequency features are primarily recovered in the early stages, while high-frequency features are recovered in the later stages. None of the previous works on quantizing diffusion models exploits this property to improve the effectiveness of the quantized models. In this paper, we propose a novel meta-learning approach for PTQ of diffusion models that jointly optimizes the contribution of each calibration sample and the weighting of frequency components at each time step when quantizing the noise estimation network. Specifically, our approach automatically learns to assign optimal weights to calibration samples while selectively focusing on mimicking specific frequency components of the data generated by the full-precision noise estimation network at each denoising time step. Extensive experiments on the CIFAR-10, LSUN-Bedrooms, FFHQ, and ImageNet datasets demonstrate that our approach consistently outperforms competing PTQ methods for diffusion models.
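To make the objective described above concrete, the following is a minimal sketch (not the paper's actual implementation) of a calibration loss that weights both samples and frequency components: the outputs of the full-precision and quantized noise estimation networks are compared in the frequency domain, with a learned per-sample weight vector and a learned per-frequency weight mask. All function and variable names here are illustrative assumptions.

```python
import numpy as np

def weighted_calibration_loss(fp_out, q_out, sample_w, freq_w):
    """Hypothetical per-sample, per-frequency weighted distillation loss.

    fp_out, q_out : (N, H, W) outputs of the full-precision and quantized
                    noise estimation networks for N calibration samples.
    sample_w      : (N,) learned sample-importance weights (e.g. softmax-normalized).
    freq_w        : (H, W) learned weights over frequency components, which
                    could differ per denoising time step.
    """
    # Compare the two networks' outputs in the frequency domain.
    fp_f = np.fft.fft2(fp_out, axes=(-2, -1))
    q_f = np.fft.fft2(q_out, axes=(-2, -1))
    # Squared error per frequency bin, reweighted by the learned frequency mask.
    err = np.abs(fp_f - q_f) ** 2                   # shape (N, H, W)
    per_sample = (freq_w * err).sum(axis=(-2, -1))  # shape (N,)
    # Reweight each calibration sample's contribution and reduce to a scalar.
    return float((sample_w * per_sample).sum())
```

In a meta-learning setup, `sample_w` and `freq_w` would themselves be optimized (e.g. by an outer loop) rather than fixed; with uniform weights the loss reduces to a plain frequency-domain L2 distillation loss.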