

Poster

Laplace Sample Information: Data Informativeness Through a Bayesian Lens

Johannes Kaiser · Kristian Schwethelm · Daniel Rueckert · Georgios Kaissis

Hall 3 + Hall 2B #414
Thu 24 Apr 7 p.m. PDT — 9:30 p.m. PDT

Abstract: Accurately estimating the informativeness of individual samples in a dataset is an important objective in deep learning, as it can guide sample selection, which can improve model efficiency and accuracy by removing redundant or potentially harmful samples. We propose $\text{\textit{Laplace Sample Information}}$ ($\mathsf{LSI}$), a measure of sample informativeness grounded in information theory that is widely applicable across model architectures and learning settings. $\mathsf{LSI}$ leverages a Bayesian approximation to the weight posterior and the KL divergence to measure the change in the parameter distribution induced by a sample of interest from the dataset. We experimentally show that $\mathsf{LSI}$ is effective in ordering the data with respect to typicality, detecting mislabeled samples, measuring class-wise informativeness, and assessing dataset difficulty. We demonstrate these capabilities of $\mathsf{LSI}$ on image and text data in supervised and unsupervised settings. Moreover, we show that $\mathsf{LSI}$ can be computed efficiently through probes and transfers well to the training of large models.
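The core quantity the abstract describes, a KL divergence between Laplace (Gaussian) approximations of the weight posterior with and without a sample of interest, has a closed form when the posteriors are Gaussian. Below is a minimal numerical sketch of that idea, assuming diagonal-covariance posteriors; the function names, the KL direction, and the diagonal assumption are illustrative choices for this sketch, not the authors' implementation.

```python
import numpy as np

def gaussian_kl_diag(mu1, var1, mu2, var2):
    """Closed-form KL( N(mu1, diag(var1)) || N(mu2, diag(var2)) )
    for diagonal-covariance Gaussians over the model parameters."""
    return 0.5 * np.sum(
        np.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0
    )

def laplace_sample_information(mu_full, var_full, mu_wo, var_wo):
    """Illustrative LSI-style score: divergence between the Laplace
    posterior fit on the full dataset and the posterior fit with the
    sample of interest removed (direction chosen arbitrarily here)."""
    return gaussian_kl_diag(mu_wo, var_wo, mu_full, var_full)

# Toy posteriors over a 2-parameter model (hypothetical numbers).
mu_full  = np.array([0.10, -0.20]); var_full = np.array([0.50, 0.30])
mu_wo    = np.array([0.15, -0.25]); var_wo   = np.array([0.55, 0.28])

score = laplace_sample_information(mu_full, var_full, mu_wo, var_wo)
# A larger score means removing the sample shifts the posterior more,
# i.e. the sample is more informative under this sketch.
```

Identical posteriors give a score of zero, and the score grows as the leave-one-out posterior drifts from the full-data posterior, which matches the intuition of the measure; the paper itself should be consulted for the exact posterior approximation and divergence direction used.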
