René Vidal — How Geometry Shapes Optimization in Deep Generative Models
Abstract
Deep generative models have achieved remarkable empirical success, but their theoretical foundations remain poorly understood. This talk presents recent progress on the geometry and optimization of modern generative models, focusing on three representative settings. First, it analyzes generative model inversion and establishes linear convergence of gradient descent under two geometric conditions on the loss landscape. Second, it studies transformer-based diffusion models trained on multi-token Gaussian mixture data, showing that gradient descent converges to the Bayes-optimal denoiser and that self-attention approximates the minimum mean-squared-error (MMSE) estimator. Third, it introduces Parsimonious Flow Matching (PFM), which replaces the standard isotropic Gaussian latent with a multimodal mixture aligned with the structure of the data, yielding better-conditioned optimization and faster convergence.
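
As a rough illustration of the first setting, the sketch below inverts a toy generator by gradient descent and checks that the error shrinks geometrically. Everything in it is an assumption made for illustration and is not taken from the talk: the linear generator G(z) = Wz, the matrix W, the step size, and the function names. Because this toy loss is strongly convex, linear convergence follows from textbook arguments; the two geometric conditions mentioned in the abstract are not specified here and may differ.

```python
import math

# Hypothetical toy generator G(z) = W z with a full-column-rank 3x2 matrix W.
# This stands in for a deep generator purely for illustration.
W = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]


def generate(z):
    """Apply the toy generator: returns W z as a list."""
    return [sum(w * zj for w, zj in zip(row, z)) for row in W]


def loss_grad(z, x):
    """Gradient of 0.5 * ||G(z) - x||^2 with respect to z, i.e. W^T (W z - x)."""
    residual = [g - xi for g, xi in zip(generate(z), x)]
    return [sum(W[i][j] * residual[i] for i in range(len(W)))
            for j in range(len(z))]


def invert(x, z0, step, iters):
    """Plain gradient descent on the inversion loss; returns every iterate."""
    z = list(z0)
    iterates = [list(z)]
    for _ in range(iters):
        g = loss_grad(z, x)
        z = [zj - step * gj for zj, gj in zip(z, g)]
        iterates.append(list(z))
    return iterates


# Ground-truth latent code and the observation it generates.
z_true = [1.0, -2.0]
x = generate(z_true)

# W^T W = [[2, 1], [1, 2]] has eigenvalues 1 and 3, so step = 1/3 is 1/L.
iterates = invert(x, z0=[0.0, 0.0], step=1.0 / 3.0, iters=40)

# Distance of each iterate to the true latent code.
errors = [math.hypot(z[0] - z_true[0], z[1] - z_true[1]) for z in iterates]
print("final error:", errors[-1])
```

With this particular W, the step size 1/3 contracts the error by a factor of at most 2/3 per iteration, which is what "linear convergence" means in this context. A nonlinear generator would make the loss nonconvex, and that harder regime is the one the talk addresses.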