Poster in Workshop: Navigating and Addressing Data Problems for Foundation Models (DPFM)

Efficient Global Data Attribution for Diffusion Models

MingYu Lu · Chris Lin · Su-In Lee

Keywords: [ Data Attribution ] [ Diffusion Models ]


Abstract:

With diffusion models now in widespread use, effective data attribution is needed to ensure fair acknowledgment for contributors of high-quality training samples and to identify potential sources of harmful content. In this early work, we introduce a novel framework tailored to removal-based data attribution for diffusion models, leveraging sparsified unlearning. This approach significantly improves the computational scalability and effectiveness of removal-based data attribution. In our experiments, we attribute a diffusion model's FID back to CIFAR-10 training images with datamodel attributions, obtaining a better linear datamodeling score (LDS) than datamodel attributions based on naive retraining.
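
For readers unfamiliar with the linear datamodeling score (LDS) named in the abstract, the following is a minimal sketch of how it is commonly computed in the data attribution literature, not the authors' implementation. It assumes one attribution score per training image, a set of random training subsets, and a measured model output (here it would be FID) for a model retrained on each subset. The function and variable names (`linear_datamodeling_score`, `subset_masks`, `retrained_outputs`) are hypothetical, and the synthetic data is for illustration only.

```python
import numpy as np


def rankdata(x):
    """Return 1-based ranks of x, assigning tied values their average rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(1, len(x) + 1)
    for value in np.unique(x):
        tied = x == value
        ranks[tied] = ranks[tied].mean()
    return ranks


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return float(np.corrcoef(rankdata(a), rankdata(b))[0, 1])


def linear_datamodeling_score(attributions, subset_masks, retrained_outputs):
    """LDS sketch (assumed standard definition).

    attributions:      shape (n_train,), one score per training sample
    subset_masks:      shape (n_subsets, n_train), True where a sample is kept
    retrained_outputs: shape (n_subsets,), measured output (e.g. FID) of a
                       model retrained on each subset
    """
    masks = np.asarray(subset_masks, dtype=float)
    # A linear datamodel predicts each subset's output as the sum of the
    # attribution scores of the samples included in that subset.
    predicted = masks @ np.asarray(attributions, dtype=float)
    return spearman(predicted, retrained_outputs)


# Synthetic illustration: outputs that are exactly linear in the subset.
rng = np.random.default_rng(0)
true_scores = rng.normal(size=50)
subset_masks = rng.random((20, 50)) < 0.5
retrained_outputs = subset_masks.astype(float) @ true_scores
lds = linear_datamodeling_score(true_scores, subset_masks, retrained_outputs)
```

A higher LDS means the attribution scores rank the retrained models' outputs more faithfully. In practice the retrained outputs come from actually training models on each subset, which is the expensive step that removal-based methods aim to make cheaper.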
