Expert-Data Alignment Governs Generation Quality in Decentralized Diffusion Models
Abstract
Decentralized Diffusion Models (DDMs) route denoising through experts trained independently on disjoint data clusters, and these experts can strongly disagree in their predictions. What governs generation quality in such systems? We present the first systematic investigation of this question. A priori, one might expect that minimizing the sensitivity of the denoising trajectory would yield the best generations. We show that this hypothesis is incorrect: full ensemble routing achieves the most stable sampling dynamics yet produces the worst generation quality (FID 47.9 vs.\ 22.6 for sparse Top-2 routing). Instead, we identify \emph{expert-data alignment} as the governing principle: generation quality depends on routing inputs to experts whose training distribution covers the current denoising state. Across two DDM systems, we validate this principle using cluster distance analysis, per-expert prediction quality, and expert disagreement analysis. For DDM deployment, our findings indicate that routing should prioritize expert-data alignment over numerical stability metrics.
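To make the contrast between sparse Top-2 routing and full ensemble routing concrete, the following is a minimal sketch, not the implementation used in this work. It assumes a router that outputs one non-negative weight per expert and that the selected weights are renormalized to sum to one. The function name `route_top_k` and the toy numbers are hypothetical and chosen only for illustration.

```python
import numpy as np


def route_top_k(router_probs, expert_preds, k):
    """Combine expert predictions using only the k highest-weight experts.

    router_probs: shape (num_experts,), non-negative router weights.
    expert_preds: shape (num_experts, dim), one denoising prediction per expert.
    Setting k = num_experts gives full ensemble routing.
    Renormalizing the kept weights is an assumption of this sketch.
    """
    router_probs = np.asarray(router_probs, dtype=float)
    expert_preds = np.asarray(expert_preds, dtype=float)
    # Indices of the k largest router weights, in descending order.
    top = np.argsort(router_probs)[::-1][:k]
    # Renormalize the kept weights so they sum to one.
    weights = router_probs[top] / router_probs[top].sum()
    # Weighted average of the selected experts' predictions.
    return weights @ expert_preds[top]


# Toy example: the third expert is poorly matched to the input and
# predicts something far from the other two.
probs = [0.6, 0.3, 0.1]
preds = [[1.0, 0.0], [0.0, 1.0], [10.0, 10.0]]

top2 = route_top_k(probs, preds, k=2)      # ignores the third expert
full = route_top_k(probs, preds, k=3)      # averages over all experts
```

In this toy setting, the full ensemble output is pulled toward the prediction of the low-weight, poorly matched expert, whereas Top-2 routing excludes it entirely. This illustrates the mechanism only; it does not reproduce the reported FID values.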