When Does Diffusion Help? PDE-Inspired Optimization on Fragmented and Noisy Data
Abstract
Diffusion-inspired regularization, motivated by partial differential equations and smooth gradient flow, is increasingly used to stabilize neural network optimization. However, it remains unclear when such mechanisms materially affect learning, particularly under irregular data geometry and noisy supervision. We study diffusion-based optimization through a controlled experimental framework built on synthetic benchmarks that explicitly isolate geometric and statistical factors. We introduce a diffusion-regularized variant of stochastic gradient descent inspired by parabolic PDE smoothing and evaluate it on datasets exhibiting highly curved decision boundaries, disconnected supports, and varying levels of label noise. Across experiments, we analyze optimization dynamics, noise robustness, and loss landscape geometry under strictly matched training conditions. We find that diffusion regularization consistently smooths gradient flow and modifies local loss geometry, yielding stable convergence in fragmented regimes. However, gains in predictive accuracy are strongly task-dependent, and accuracy is often governed more by dataset structure than by optimizer choice. These results clarify when PDE-inspired diffusion meaningfully shapes optimization geometry and highlight its limitations as a general-purpose mechanism for improving performance.