When Guidance Breaks: A Schrödinger Bridge Perspective on Inference-Time Alignment in Diffusion Models
Mahule Roy ⋅ Subhas Roy
Abstract
Inference-time guidance aligns diffusion models with downstream constraints without retraining, yet excessive guidance induces mode collapse, reduced diversity, and instability. We provide a theoretical account of this failure through Schrödinger bridge (SB) theory. Viewing diffusion sampling as entropy-regularized optimal transport, we show that guidance corresponds to exponential tilting of the terminal marginal. As the guidance scale increases, the associated optimal control energy grows rapidly, leading to ill-conditioned bridge dynamics under finite diffusion noise and discrete solvers. Motivated by the SB dual formulation, we propose a training-free adaptive guidance scheme that normalizes guidance by the local gradient magnitude, stabilizing inference. Experiments on 2D mixtures and CIFAR-10 demonstrate that adaptive guidance preserves diversity (LPIPS $0.56$ vs. $0.28$ for fixed high guidance) while maintaining strong alignment. These results support both the theoretical mechanism and the practical benefit.
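To make the idea of normalizing guidance by local gradient magnitude concrete, the following is a minimal sketch rather than the authors' implementation. The abstract does not specify the normalization rule, so the function names, the `eps` stabilizer, and the quadratic toy reward are all assumptions chosen for illustration. The sketch contrasts a fixed-scale guidance term, whose magnitude grows with the gradient, against a normalized one, whose magnitude is bounded by the guidance scale.

```python
import numpy as np

def fixed_guidance(grad, scale):
    """Fixed-scale guidance: the term grows linearly with the gradient norm."""
    return scale * grad

def adaptive_guidance(grad, scale, eps=1e-8):
    """Hypothetical normalized variant (an assumption, not the paper's exact rule):
    divide by the per-sample gradient norm so the guidance term keeps its
    direction but has magnitude at most `scale`."""
    norm = np.linalg.norm(grad, axis=-1, keepdims=True)
    return scale * grad / (norm + eps)

# Toy reward r(x) = -0.5 * ||x - target||^2, so grad r(x) = target - x.
target = np.array([2.0, 0.0])
x = np.array([[0.0, 0.0], [-10.0, 5.0], [1.9, 0.1]])
grad = target - x

scale = 8.0
fixed = fixed_guidance(grad, scale)
adaptive = adaptive_guidance(grad, scale)

# Per-sample magnitudes of the guidance term under each scheme.
fixed_norms = np.linalg.norm(fixed, axis=-1)
adaptive_norms = np.linalg.norm(adaptive, axis=-1)
print("fixed:   ", fixed_norms)
print("adaptive:", adaptive_norms)
```

In a sampler, a term of this kind would be added to the model's score or drift at each step. Bounding its magnitude is one plausible way to keep the control contribution from dominating the diffusion noise at large guidance scales.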