Invited Talk
in
Workshop: Deep Generative Model in Machine Learning: Theory, Principle and Efficacy
Jerry Li
Diffusion models (DMs) suffer from a major performance bottleneck in practice: their inference process is extremely computationally expensive and time-consuming. State-of-the-art models require evaluating a large neural network, often hundreds or thousands of times, to generate a single output, creating substantial inference costs that significantly limit real-world deployment. A central research question is how to maximize the performance of a fixed diffusion model using significantly fewer neural function evaluations (NFEs). In this work, we propose a new method that learns a good solver for the DM, which we call Solving for the Solver (S4S). S4S directly optimizes a solver for generation quality by learning to match the output of a strong teacher solver. We evaluate S4S on six different pre-trained DMs, including pixel-space and latent-space DMs for both conditional and unconditional sampling. In all settings, S4S uniformly improves sample quality relative to traditional ODE solvers. Moreover, our method is lightweight, data-free, and can be plugged in, black-box, on top of any discretization schedule or architecture to improve performance. Building on this, we also propose S4S-Alt, which optimizes both the solver and the discretization schedule. By exploiting the full design space of DM solvers, with 5 NFEs we achieve an FID of 3.73 on CIFAR-10 and 13.26 on MS-COCO, a 1.5× improvement over previous training-free ODE methods.
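The abstract describes the core recipe only at a high level: freeze the pre-trained diffusion model, parameterize a few-NFE solver with learnable coefficients, and train those coefficients, data-free from random noise, to match the output of a high-NFE teacher solver. The sketch below illustrates that idea under several assumptions the abstract does not spell out: a toy probability-flow-style ODE, a linear-multistep parameterization of the learned solver (initialized to plain Euler), and a many-step Euler teacher. Names such as ToyEpsModel, LearnedSolver, and euler_teacher are hypothetical; this is a minimal illustration of solver distillation, not the authors' implementation.

```python
# Minimal sketch of the S4S idea: learn solver coefficients so that a
# few-step sampler matches a many-step teacher solver. The ODE dynamics,
# solver parameterization, and all names here are illustrative assumptions.
import torch
import torch.nn as nn

class ToyEpsModel(nn.Module):
    """Stand-in for a frozen, pre-trained diffusion network eps_theta(x, t)."""
    def __init__(self, dim=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 64), nn.SiLU(),
                                 nn.Linear(64, dim))

    def forward(self, x, t):
        # Broadcast the scalar time onto the batch and condition on it.
        return self.net(torch.cat([x, t.expand(x.shape[0], 1)], dim=-1))

def euler_teacher(eps, x, ts):
    """Many-NFE Euler solver playing the role of the strong teacher."""
    for t0, t1 in zip(ts[:-1], ts[1:]):
        x = x + (t1 - t0) * eps(x, t0)
    return x

class LearnedSolver(nn.Module):
    """Few-NFE solver whose update direction is a learnable combination of
    all previously computed network evaluations (a linear-multistep form)."""
    def __init__(self, ts):
        super().__init__()
        self.ts = ts
        n = len(ts) - 1
        # coeffs[i, j] weighs the j-th stored evaluation at step i;
        # identity init makes the solver start out as plain Euler.
        self.coeffs = nn.Parameter(torch.eye(n))

    def forward(self, eps, x):
        history = []
        for i, (t0, t1) in enumerate(zip(self.ts[:-1], self.ts[1:])):
            history.append(eps(x, t0))
            direction = sum(self.coeffs[i, j] * h
                            for j, h in enumerate(history))
            x = x + (t1 - t0) * direction
        return x

torch.manual_seed(0)
eps = ToyEpsModel()
for p in eps.parameters():       # the diffusion model stays frozen throughout
    p.requires_grad_(False)

t_teacher = torch.linspace(1.0, 0.0, 101)  # 100-NFE teacher schedule
t_student = torch.linspace(1.0, 0.0, 6)    # 5-NFE student schedule
solver = LearnedSolver(t_student)
opt = torch.optim.Adam(solver.parameters(), lr=1e-2)

for step in range(200):          # data-free: only random noise is needed
    x_T = torch.randn(32, 16)
    with torch.no_grad():
        target = euler_teacher(eps, x_T, t_teacher)
    loss = (solver(eps, x_T) - target).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Only the handful of solver coefficients are trained; the diffusion network itself is never updated, which is what makes the approach lightweight, and the loss is computed purely from sampled noise, which is what makes it data-free. In this sketch, S4S-Alt would correspond to additionally treating the analogue of t_student as learnable alongside the coefficients.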