ICLR Poster Improved Analysis of Sparse Linear Regression in Local Differential Privacy Model

Poster

Improved Analysis of Sparse Linear Regression in Local Differential Privacy Model

Liyang Zhu · Meng Ding · Vaneet Aggarwal · Jinhui Xu · Di Wang

Halle B #214

[ Abstract ]

[ Poster] [ OpenReview]

Abstract: In this paper, we revisit the problem of sparse linear regression in the local differential privacy (LDP) model. Existing research in the non-interactive and sequentially local models has focused on obtaining the lower bounds for the case where the underlying parameter is

$1$ -sparse, and extending such bounds to the more general

$k$ -sparse case has proven to be challenging. Moreover, it is unclear whether efficient non-interactive LDP (NLDP) algorithms exist. To address these issues, we first consider the problem in the

$\epsilon$ non-interactive LDP model and provide a lower bound of

$\Omega(\frac{\sqrt{dk\log d}}{\sqrt{n}\epsilon})$ on the

$\ell_2$ -norm estimation error for sub-Gaussian data, where

$n$ is the sample size and

$d$ is the dimension of the space. We propose an innovative NLDP algorithm, the very first of its kind for the problem. As a remarkable outcome, this algorithm also yields a novel and highly efficient estimator as a valuable by-product. Our algorithm achieves an upper bound of

$\tilde{O}({\frac{d\sqrt{k}}{\sqrt{n}\epsilon}})$ for the estimation error when the data is sub-Gaussian, which can be further improved by a factor of

$O(\sqrt{d})$ if the server has additional public but unlabeled data. For the sequentially interactive LDP model, we show a similar lower bound of

$\Omega({\frac{\sqrt{dk}}{\sqrt{n}\epsilon}})$ . As for the upper bound, we rectify a previous method and show that it is possible to achieve a bound of

$\tilde{O}(\frac{k\sqrt{d}}{\sqrt{n}\epsilon})$ . Our findings reveal fundamental differences between the non-private case, central DP model, and local DP model in the sparse linear regression problem.

Chat is not available.