Composition of Pretrained Diffusion Models: A Logic-Based Calculus
Abstract
Composing pretrained diffusion models provides a cost-effective mechanism to encode constraints and unlock complex generative capabilities. Prior work relies on hand-crafted compositional operators that seek to extend set-theoretic notions such as union and intersection to diffusion models, e.g., using a product or mixture of the underlying energy functions. We show that combining these operators is inadequate and inconsistent: it leads to limited mode coverage, biased sampling, and instability under negation queries, and it fails to satisfy basic compositional laws such as idempotency and distributivity. We introduce a principled calculus grounded in fuzzy logic that resolves these issues. Specifically, we define a general class of conjunction, disjunction, and negation operators that generalize the classical mixtures, and we illustrate how they circumvent these pathologies and enable precise combinatorial reasoning with score models. Beyond what existing methods offer, the proposed Dombi operators yield complex generative outcomes, such as the Exclusive-OR (XOR) of individual scores. We establish rigorous theoretical guarantees on the stability and temperature scaling of Dombi compositions, and derive Feynman-Kac correctors to mitigate the sampling bias in score composition. Empirical results on image generation with Stable Diffusion and on multi-objective molecular generation substantiate the conceptual, theoretical, and methodological benefits. Overall, this work lays the foundation for the systematic design, analysis, and deployment of diffusion ensembles. Code is available at https://github.com/Aalto-QuML/logic-diffusion-composition
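For readers unfamiliar with the fuzzy-logic operators named above, the following is a minimal sketch of the textbook Dombi conjunction, its De Morgan dual disjunction, and an XOR assembled from them, acting on truth values in [0, 1]. It is not the paper's construction: the abstract does not state how scores or energies are mapped into these operators, so the function names (`dombi_and`, `dombi_or`, `fuzzy_not`, `dombi_xor`), the parameter name `lam`, and the choice of standard negation 1 - x are assumptions made purely for illustration.

```python
def dombi_and(x: float, y: float, lam: float = 1.0) -> float:
    """Textbook Dombi t-norm (fuzzy AND) for x, y in [0, 1] and lam > 0."""
    if x == 0.0 or y == 0.0:
        return 0.0  # 0 is absorbing for any t-norm
    a = ((1.0 - x) / x) ** lam
    b = ((1.0 - y) / y) ** lam
    return 1.0 / (1.0 + (a + b) ** (1.0 / lam))


def fuzzy_not(x: float) -> float:
    """Standard fuzzy negation (an assumption; other negations exist)."""
    return 1.0 - x


def dombi_or(x: float, y: float, lam: float = 1.0) -> float:
    """Dombi t-conorm (fuzzy OR), obtained from the t-norm by De Morgan duality."""
    return fuzzy_not(dombi_and(fuzzy_not(x), fuzzy_not(y), lam))


def dombi_xor(x: float, y: float, lam: float = 1.0) -> float:
    """XOR written as (x AND NOT y) OR (NOT x AND y), illustrative only."""
    left = dombi_and(x, fuzzy_not(y), lam)
    right = dombi_and(fuzzy_not(x), y, lam)
    return dombi_or(left, right, lam)


if __name__ == "__main__":
    for x, y in [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5)]:
        print(x, y, dombi_and(x, y), dombi_or(x, y), dombi_xor(x, y))
```

On crisp inputs (0 or 1) these functions reproduce the Boolean truth tables. With `lam = 1` the conjunction reduces to the Hamacher product xy / (x + y - xy), so `dombi_and(0.5, 0.5)` equals 1/3, and as `lam` grows the conjunction approaches `min(x, y)`.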