

Poster in Workshop: Bridging the Gap Between Practice and Theory in Deep Learning

Robust Sparse Mean Estimation via Incremental Learning

Jianhao Ma · Rui Chen · Yinghui He · Salar Fattahi · Wei Hu


Abstract: In this paper, we study the problem of robust sparse mean estimation, where the goal is to estimate a $k$-sparse mean from a collection of samples drawn from a heavy-tailed distribution, a subset of which are grossly corrupted by a strong adversary. Existing estimators face two critical challenges in this setting. First, they rely on prior knowledge of the sparsity level $k$. Second, they fall short of practical use because they scale poorly with the ambient dimension. This paper presents a simple mean estimator that overcomes both challenges under a moderate condition: it works without requiring prior knowledge of $k$ and runs in near-linear time and memory (with respect to the ambient dimension). Moreover, under a mild assumption on the signal-to-noise ratio, we show that the guarantee we provide is information-theoretically optimal, thereby breaking a well-known computational-statistical trade-off. At the core of our method lies an incremental learning phenomenon: we show that the subgradient method applied to a two-layer diagonal linear neural network with $\ell_1$-loss can incrementally learn the top-$k$ nonzero elements of the mean, while keeping the zero elements arbitrarily small. Our extensive experiments show that our proposed method outperforms existing methods in terms of both efficiency and robustness. More broadly, our results highlight the emergence of robust incremental learning, even for heavy-tailed distributions and under strong contamination models.
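The following is a minimal NumPy sketch of the incremental-learning phenomenon described in the abstract, not the authors' exact algorithm: the mean estimate is reparameterized as the elementwise product of two vectors (a two-layer diagonal linear network) and trained with the subgradient method on the $\ell_1$ loss from a small initialization. The data-generation choices, hyperparameters (`alpha`, `eta`, number of steps), and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n, k = 1000, 500, 5                 # ambient dimension, sample size, sparsity
mu = np.zeros(d)
mu[:k] = np.array([10.0, 8.0, 6.0, 4.0, 2.0])   # k-sparse true mean

# Heavy-tailed samples (Student-t) with a fraction of grossly corrupted rows.
X = mu + rng.standard_t(df=3, size=(n, d))
n_bad = int(0.1 * n)
X[:n_bad] = 100.0                      # adversarially corrupted samples

# Two-layer diagonal linear network: the estimate is u * v (elementwise).
alpha = 1e-3                           # small initialization scale
u = alpha * np.ones(d)
v = alpha * np.ones(d)
eta = 1e-2                             # step size

for t in range(2000):
    est = u * v
    # Subgradient of (1/n) * sum_i ||u * v - x_i||_1 with respect to est.
    g = np.sign(est[None, :] - X).mean(axis=0)
    # Chain rule through the diagonal-network parameterization.
    u, v = u - eta * g * v, v - eta * g * u

est = u * v
print("max error on the support:", np.abs(est[:k] - mu[:k]).max())
print("largest off-support entry:", np.abs(est[k:]).max())
```

In this toy run, the large nonzero coordinates of the mean are picked up one by one as the corresponding entries of `u` and `v` grow away from the small initialization, while the off-support entries remain close to zero, which is the incremental-learning behavior the abstract refers to.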
