

Poster in Workshop: Navigating and Addressing Data Problems for Foundation Models (DPFM)

[***Online Presentation***] Distributional Dataset Distillation with Subtask Decomposition

Tian Qin · Zhiwei Deng · David Alvarez-Melis

Keywords: [ synthetic data generation ] [ data reduction ] [ dataset distillation ]


Abstract:

What does a neural network learn when trained on a task-specific dataset? Synthesizing this knowledge is the central idea behind Dataset Distillation, which recent work has shown can be used to compress large datasets into a small set of input-label pairs (prototypes) that capture essential aspects of the original dataset. In this paper, we make the key observation that distilling into explicit prototypes, as existing methods do, is often suboptimal, incurring unexpected storage costs from the distilled labels. In response, we propose Distributional Dataset Distillation (D3), which encodes the data using minimal sufficient per-class statistics and a decoder, resulting in a compact representation that is more memory-efficient than prototype-based methods. To scale up the process of learning these representations, we propose Federated distillation, which decomposes the dataset into subsets, distills them in parallel using sub-task experts, and then re-aggregates them. We thoroughly evaluate our algorithm on a three-dimensional metric and show that our method achieves state-of-the-art results on TinyImageNet and ImageNet-1K. Specifically, we outperform the prior art by 5.6% on ImageNet-1K under a storage budget of two images per class.
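As a rough illustration of the distributional idea described in the abstract (not the authors' implementation), the sketch below assumes each class is summarized by a learned Gaussian in a latent space, with a small shared decoder mapping sampled latents to synthetic images; a student network would then be trained on freshly sampled batches. All names (D3Memory, LatentDecoder, latent_dim, etc.) are hypothetical.

```
import torch
import torch.nn as nn

class LatentDecoder(nn.Module):
    """Hypothetical shared decoder mapping latent codes to images."""
    def __init__(self, latent_dim=64, img_channels=3, img_size=32):
        super().__init__()
        self.shape = (img_channels, img_size, img_size)
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 256),
            nn.ReLU(),
            nn.Linear(256, img_channels * img_size * img_size),
            nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z).view(z.size(0), *self.shape)

class D3Memory(nn.Module):
    """Per-class latent Gaussians (mean, log-variance) plus a shared decoder.

    Storage is O(num_classes * latent_dim) plus the decoder parameters,
    rather than O(num_classes * images_per_class * image_size) for
    explicit input-label prototypes.
    """
    def __init__(self, num_classes, latent_dim=64):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(num_classes, latent_dim))
        self.log_var = nn.Parameter(torch.zeros(num_classes, latent_dim))
        self.decoder = LatentDecoder(latent_dim)

    def sample(self, labels):
        """Draw one synthetic image per label by sampling its class Gaussian."""
        mu = self.mu[labels]
        std = (0.5 * self.log_var[labels]).exp()
        z = mu + std * torch.randn_like(std)
        return self.decoder(z), labels

# Usage: generate a synthetic batch for training a student network.
memory = D3Memory(num_classes=10)
labels = torch.randint(0, 10, (32,))
images, labels = memory.sample(labels)
```

Under the federated-distillation decomposition, one could train an independent D3Memory per class subset in parallel and then merge the per-class statistics into a single memory, since the parameters are indexed by class.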
