Poster in Workshop: 2nd Workshop on Navigating and Addressing Data Problems for Foundation Models (DATA-FM)

A Missing Testbed for LLM Pre-Training Membership Inference Attacks

Mingjian Jiang · Ken Liu · Sanmi Koyejo


Abstract:

We introduce a simple and rigorous testbed for membership inference attacks (MIA) against pre-training sequences for large language models (LLMs). Our testbed addresses the following gaps in existing evaluations, which lack: (1) uniform sampling of member/non-member documents of varying lengths from pre-training shards; (2) large-scale deduplication at varying strengths, both within and across the sampled members/non-members; and (3) rigorous statistical tests to detect member/non-member distribution shifts that cause faulty evaluations and are otherwise imperceptible to the heuristic techniques used in prior work. We provide both global- and domain-level datasets (e.g., Reddit, Stack Exchange, Wikipedia), derived from fully-open pre-trained LLM/dataset pairs including Pythia/Pile, Olmo/Dolma, and our custom pre-trained GPT-2-Large on FineWeb-Edu. We additionally open source a modular and extensible codebase that facilitates the creation of custom, statistically validated, and deduplicated evaluation data using future open models and datasets. In sum, our work is a concrete step towards addressing the evaluation issues discussed by prior work.
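
To illustrate point (3), the sketch below shows one way a member/non-member distribution shift could be flagged with a standard two-sample test. This is a hypothetical example on a single surface feature (document length), not the paper's actual validation procedure; the feature choice, test, and threshold are all assumptions for illustration.

```python
# Hypothetical illustration (not the paper's exact procedure): flag a member/non-member
# distribution shift with a two-sample Kolmogorov-Smirnov test on document length.
# A small p-value suggests the split is skewed, so MIA scores on it may reflect the
# shift rather than true membership signal.
from scipy.stats import ks_2samp

def check_length_shift(members, non_members, alpha=0.01):
    """members, non_members: lists of raw document strings sampled for evaluation."""
    member_lens = [len(doc) for doc in members]
    non_member_lens = [len(doc) for doc in non_members]
    stat, p_value = ks_2samp(member_lens, non_member_lens)
    return {
        "ks_statistic": stat,
        "p_value": p_value,
        "shift_detected": p_value < alpha,  # reject "same distribution" at level alpha
    }
```

In practice, such a check would be repeated over multiple features and domains; a split that fails it should be resampled or deduplicated more aggressively before being used for MIA evaluation.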
