Poster in Workshop on Sparsity in LLMs (SLLM): Deep Dive into Mixture of Experts, Quantization, Hardware, and Inference
Evaluating LLM Memorization Using Soft Token Sparsity
Zhili Feng · Yixuan Xu · Alexander Robey · Avi Schwarzschild · Zico Kolter
Large language models (LLMs) have been shown to memorize portions of their training data, posing risks to privacy and copyright. Many previous studies have attempted to define memorization in a practical way that enables scalable detection. In this work, we investigate compressive memorization and address its key limitation: computational inefficiency. To this end, we propose the adversarial sparsity ratio (ASR) as a proxy for compressive memorization. ASR identifies sparse soft prompts that reconstruct target sequences, enabling a more computationally tractable assessment of memorization. Empirically, we show that ASR effectively distinguishes memorized from non-memorized content, both within and across models. Furthermore, beyond verbatim memorization, ASR also captures memorization of underlying knowledge, offering a scalable and interpretable tool for analyzing memorization in LLMs.
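To make the core idea concrete, below is a minimal sketch of how a sparsity-based proxy of this kind could be computed: a soft prompt of learnable embeddings is prepended to the target sequence and optimized so the model reconstructs the target, with a group penalty that pushes whole soft tokens toward zero; the fraction of soft tokens that remain "active" relative to the target length is returned as the score. The function name `adversarial_sparsity_ratio`, the group-L1 penalty, the activity threshold `tau`, and all hyperparameters are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch: a sparse-soft-prompt proxy for memorization (assumed details).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def adversarial_sparsity_ratio(model_name, target_text, n_soft=20,
                               steps=200, lr=1e-2, l1=1e-3, tau=1e-2):
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()

    target_ids = tok(target_text, return_tensors="pt").input_ids   # (1, T)
    target_emb = model.get_input_embeddings()(target_ids)          # (1, T, d)
    d = target_emb.shape[-1]

    # Learnable soft prompt, initialized small so sparsity is reachable.
    soft = torch.nn.Parameter(0.01 * torch.randn(1, n_soft, d))
    opt = torch.optim.Adam([soft], lr=lr)

    for _ in range(steps):
        opt.zero_grad()
        inputs = torch.cat([soft, target_emb], dim=1)               # prepend soft prompt
        logits = model(inputs_embeds=inputs).logits
        # Each target token is predicted from the position immediately before it.
        pred = logits[:, n_soft - 1:-1, :]
        loss = torch.nn.functional.cross_entropy(
            pred.reshape(-1, pred.size(-1)), target_ids.reshape(-1))
        # Group-L1 penalty: encourage entire soft tokens to shrink toward zero.
        loss = loss + l1 * soft.norm(dim=-1).sum()
        loss.backward()
        opt.step()

    # Soft tokens whose norm stays above the threshold count as "active".
    active = (soft.detach().norm(dim=-1) > tau).sum().item()
    return active / target_ids.shape[1]


# Example usage (hypothetical): a lower ratio would suggest the sequence is
# reconstructible from fewer soft tokens, i.e., more strongly memorized.
# score = adversarial_sparsity_ratio("gpt2", "Some candidate training passage.")
```

Under this reading, heavily memorized text needs only a few active soft tokens to trigger reconstruction, while novel text requires a prompt whose effective size approaches the target length, which is what makes the ratio usable as a scalable proxy.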