

Poster in Workshop: Deep Generative Model in Machine Learning: Theory, Principle and Efficacy

Improved Techniques for Training Smaller and Faster Stable Diffusion

Hesong Wang · Huan Wang

Keywords: [ Network Pruning ] [ Stable Diffusion ] [ Step Distillation ]


Abstract: Recent SoTA text-to-image diffusion models achieve impressive generation quality, but their computational cost is prohibitively large. Network pruning and step distillation are two widely used compression techniques for reducing model size and the number of inference steps. This work presents several improved techniques in these two areas to train smaller and faster diffusion models at a low training cost. Specifically, compared to prior SoTA counterparts, we introduce a structured pruning method that removes insignificant weight blocks based on an improved performance-sensitivity measure. To regain performance after pruning, a CFG-aware retraining loss is proposed, which is shown to be critical to performance. Finally, a modified CFG-aware step distillation is used to reduce the number of inference steps. Empirically, our method prunes the U-Net parameters of SD v2.1 base by 46\% and reduces inference steps from 25 to 8, achieving an overall $3.0\times$ wall-clock inference speedup. Our 8-step model is significantly better than the 25-step BK-SDM, the prior SoTA for cheap Stable Diffusion, while being even smaller.
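The abstract does not spell out the CFG-aware losses. As a rough illustration only: classifier-free guidance (CFG) combines a conditional and an unconditional prediction as eps_u + w * (eps_c - eps_u), and one plausible reading of a "CFG-aware" retraining or distillation loss is to regress the student onto the teacher's guided prediction rather than its unguided output. The sketch below is a minimal illustration under that assumption; the function and argument names (cfg_teacher_target, cfg_aware_loss, cond_emb, null_emb, w) are hypothetical and not the authors' code.

```python
# Minimal sketch (an assumption, not the paper's implementation) of a CFG-aware
# distillation loss: the student regresses onto the teacher's classifier-free-
# guidance (CFG) prediction instead of its unguided noise prediction.
import torch
import torch.nn.functional as F


@torch.no_grad()
def cfg_teacher_target(teacher, x_t, t, cond_emb, null_emb, w):
    """Teacher's guided noise prediction: eps_u + w * (eps_c - eps_u)."""
    eps_c = teacher(x_t, t, cond_emb)   # conditional prediction
    eps_u = teacher(x_t, t, null_emb)   # unconditional prediction
    return eps_u + w * (eps_c - eps_u)


def cfg_aware_loss(student, teacher, x_t, t, cond_emb, null_emb, w=7.5):
    """MSE between the student's prediction and the teacher's CFG target."""
    target = cfg_teacher_target(teacher, x_t, t, cond_emb, null_emb, w)
    return F.mse_loss(student(x_t, t, cond_emb), target)
```

In this sketch, during retraining after pruning the student would be the pruned U-Net and the teacher the original one; in step distillation the teacher would additionally take several denoising steps per student step before producing the target (not shown here).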
