Towards Stable and Deterministic Diffusion via Reverse Transition Kernels
Abstract
Diffusion models have become a leading paradigm for generative modeling in both visual and scientific domains, but their deployment is constrained by the high computational cost of sampling, which typically requires long stochastic trajectories of sequential updates. This bottleneck is particularly pronounced in high-dimensional settings such as molecular generation and image synthesis, where inference latency scales poorly with both dimensionality and the number of diffusion steps. We provide a principled foundation for overcoming this limitation by showing that deterministic diffusion models, when viewed through the Reverse Transition Kernel (RTK) framework (\cite{huang2024reversetransitionkernelflexible}), induce structured and well-conditioned reverse-time subproblems. From this perspective, sampling can be reformulated as a sequence of strongly log-concave subproblems, which admit stable and efficient updates with constant step sizes. Beyond its theoretical implications, this reinterpretation provides a unifying lens for understanding deterministic and stochastic diffusion. On a molecular benchmark (GEOM-DRUGS), our method achieves faster convergence and improved structural fidelity while preserving chemical validity. On image generation tasks, we observe consistent gains in stability and sample quality, supporting the generality of the framework. We further validate our theoretical analysis empirically by measuring key regularity properties of the learned denoiser. These findings suggest a practical pathway toward fast, stable, and scalable sampling of high-dimensional data.
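To make the role of strong log-concavity and constant step sizes concrete, the following minimal sketch runs fixed-step gradient descent on a toy Gaussian potential. It is an illustration under stated assumptions, not the RTK subproblem or the sampler studied in this work: the quadratic potential, the curvature values `a`, the vector `b`, and the helper `constant_step_descent` are all hypothetical choices made for the example.

```python
# Illustrative only: a toy strongly log-concave target, not the paper's RTK subproblem.
# Potential f(x) = 0.5 * sum_i a_i * x_i^2 - sum_i b_i * x_i, so the density
# proportional to exp(-f) is Gaussian with strong-convexity constant m = min(a_i)
# and smoothness constant L = max(a_i).

def grad_f(x, a, b):
    """Gradient of the quadratic potential f."""
    return [ai * xi - bi for ai, xi, bi in zip(a, x, b)]

def constant_step_descent(x0, a, b, num_steps):
    """Gradient descent with the fixed step size 1/L (hypothetical helper)."""
    step = 1.0 / max(a)  # one constant step size, no schedule
    x = list(x0)
    for _ in range(num_steps):
        g = grad_f(x, a, b)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

a = [1.0, 4.0, 10.0]   # assumed curvatures: m = 1, L = 10
b = [2.0, -4.0, 5.0]   # assumed linear term
mode = [bi / ai for ai, bi in zip(a, b)]  # exact minimizer of f (mode of the density)

x = constant_step_descent([0.0, 0.0, 0.0], a, b, num_steps=200)
err = max(abs(xi - mi) for xi, mi in zip(x, mode))
print("max coordinate error after 200 steps:", err)
```

In this toy setting each coordinate error contracts by a factor of at most 1 - m/L per step, so a single constant step size suffices; the slowest coordinate here contracts by 0.9 per iteration.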