

Poster

On the Inherent Privacy Properties of Discrete Denoising Diffusion Models

Eli Chien · Pan Li · Vamsi Potluru · Haoteng Yin · Eleonora Kreacic · Haoyu Wang · Rongzhe Wei

Hall 3 + Hall 2B #495
Thu 24 Apr 7 p.m. PDT — 9:30 p.m. PDT

Abstract: Privacy concerns have led to a surge in the creation of synthetic datasets, with diffusion models emerging as a promising avenue. Although prior studies have performed empirical evaluations on these models, there has been a gap in providing a mathematical characterization of their privacy-preserving capabilities. To address this, we present the first theoretical exploration of the privacy preservation inherent in \emph{discrete diffusion models} (DDMs) for discrete dataset generation. Focusing on per-instance differential privacy (pDP), our framework elucidates the potential privacy leakage for each data point in a given training dataset, offering insights into how the privacy loss of each point correlates with the dataset's distribution. Our bounds also show that, for a training dataset of size $s$, the privacy leakage of the DDM surges from $(\epsilon, O(\frac{1}{s^2\epsilon}))$-pDP to $(\epsilon, O(\frac{1}{s\epsilon}))$-pDP during the transition from the pure-noise phase to the synthetic clean-data phase, and that a faster decay in the diffusion coefficients amplifies the privacy guarantee. Finally, we empirically verify our theoretical findings on both synthetic and real-world datasets.
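As a toy illustration of how these bounds scale (a minimal sketch; the constant `C` and the closed forms below are hypothetical stand-ins for the hidden constants in the paper's $O(\cdot)$ bounds, not the paper's actual expressions), the failure probability $\delta$ shrinks quadratically in the dataset size $s$ during the pure-noise phase but only linearly during the clean-data phase:

```python
# Hypothetical illustration of pDP bound scaling in dataset size s.
# C is an assumed placeholder constant; the real bounds hide
# distribution-dependent factors inside the O(.) notation.

def delta_noise_phase(s: int, eps: float, C: float = 1.0) -> float:
    """(eps, O(1/(s^2 * eps)))-pDP during the pure-noise phase."""
    return C / (s**2 * eps)

def delta_clean_phase(s: int, eps: float, C: float = 1.0) -> float:
    """(eps, O(1/(s * eps)))-pDP during the synthetic clean-data phase."""
    return C / (s * eps)

if __name__ == "__main__":
    eps = 1.0
    for s in (10, 100, 1000):
        print(f"s={s:5d}  noise-phase delta={delta_noise_phase(s, eps):.6f}  "
              f"clean-phase delta={delta_clean_phase(s, eps):.6f}")
```

Under this toy scaling, growing the dataset tenfold cuts the noise-phase leakage bound by 100x but the clean-phase bound by only 10x, matching the abstract's point that leakage surges as generation approaches clean data.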
