Private Linear Regression via a Down-Sensitivity to Privacy Reduction
Abstract
We present a sample- and time-efficient (epsilon, delta)-differentially private algorithm for d-dimensional linear regression. Our algorithm achieves sample complexity n = \tilde{O}(d / alpha^2 + d log(1/delta) / (alpha epsilon) + d log(1/delta) / epsilon) + o(d), improving over prior polynomial-time approaches. In contrast to DP-SGD, whose guarantees depend on the condition number of the design matrix, Sum-of-Squares methods that scale quadratically with dimension, and outlier-removal approaches that scale quadratically with the privacy parameter, our method avoids all three limitations. Our algorithm is based on a novel subsample-test-aggregate (STA) framework for ensuring privacy given only bounded down-sensitivity -- robustness to removal, but not addition, of a small number of samples. The intuition that down-sensitivity should be related to privacy is not new, but STA formalizes this by providing an efficient black-box reduction from down-sensitivity to privacy which we expect to be applicable beyond the setting of linear regression.