

Poster

Laplace Sample Information: Data Informativeness Through a Bayesian Lens

Johannes Kaiser · Kristian Schwethelm · Daniel Rueckert · Georgios Kaissis

Hall 3 + Hall 2B #414
Thu 24 Apr 7 p.m. PDT — 9:30 p.m. PDT

Abstract: Accurately estimating the informativeness of individual samples in a dataset is an important objective in deep learning, as it can guide sample selection, which can improve model efficiency and accuracy by removing redundant or potentially harmful samples. We propose Laplace Sample Information (LSI), a measure of sample informativeness grounded in information theory that is widely applicable across model architectures and learning settings. LSI leverages a Bayesian approximation to the weight posterior and the KL divergence to measure the change in the parameter distribution induced by a sample of interest from the dataset. We experimentally show that LSI is effective in ordering the data with respect to typicality, detecting mislabeled samples, measuring class-wise informativeness, and assessing dataset difficulty. We demonstrate these capabilities of LSI on image and text data in supervised and unsupervised settings. Moreover, we show that LSI can be computed efficiently through probes and transfers well to the training of large models.
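
To make the abstract's description concrete, here is a minimal sketch of how such a measure could be computed, assuming a diagonal Gaussian (Laplace) approximation of the weight posterior fit on the full dataset and on the dataset with the sample of interest removed, with the closed-form KL divergence between the two Gaussians as the score. The function names, the leave-one-out setup, and the diagonal covariance are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np


def gaussian_kl_diag(mu_p, var_p, mu_q, var_q):
    """Closed-form KL( N(mu_p, diag(var_p)) || N(mu_q, diag(var_q)) )."""
    return 0.5 * np.sum(
        np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0
    )


def laplace_sample_information(posterior_full, posterior_loo):
    """Hypothetical LSI for one sample: KL divergence between the diagonal
    Laplace posterior fit on the full dataset and the one fit with the
    sample of interest removed (leave-one-out)."""
    mu_full, var_full = posterior_full
    mu_loo, var_loo = posterior_loo
    return gaussian_kl_diag(mu_full, var_full, mu_loo, var_loo)


# Toy usage: random vectors stand in for fitted Laplace posterior parameters.
rng = np.random.default_rng(0)
d = 10  # number of model parameters
full = (rng.normal(size=d), np.abs(rng.normal(size=d)) + 1e-3)
loo = (full[0] + 0.01 * rng.normal(size=d),
       full[1] * (1.0 + 0.01 * rng.normal(size=d)))
print(laplace_sample_information(full, loo))
```

In practice the posterior parameters would come from a Laplace approximation around trained weights (e.g., via the Fisher information or a Hessian approximation), and a larger KL divergence would indicate that the sample shifts the parameter distribution more, i.e., is more informative.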
