On InstaHide, Phase Retrieval, and Sparse Matrix Factorization

Sitan Chen · Xiaoxiao Li · Zhao Song · Danyang Zhuo

Keywords: [ phase retrieval ] [ InstaHide ] [ distributed learning ] [ matrix factorization ]

[ Abstract ]
[ Paper ]
Mon 3 May 1 a.m. PDT — 3 a.m. PDT


In this work, we examine the security of InstaHide, a scheme recently proposed by \cite{hsla20} for preserving the security of private datasets in the context of distributed learning. To generate a synthetic training example to be shared among the distributed learners, InstaHide takes a convex combination of private feature vectors and randomly flips the sign of each entry of the resulting vector with probability 1/2. A salient question is whether this scheme is secure in any provable sense, perhaps under a plausible complexity-theoretic assumption.

The answer to this turns out to be quite subtle and closely related to the average-case complexity of a multi-task, missing-data version of the classic problem of phase retrieval that is interesting in its own right. Motivated by this connection, under the standard distributional assumption that the public/private feature vectors are isotropic Gaussian, we design an algorithm that can actually recover a private vector using only the public vectors and a sequence of synthetic vectors generated by InstaHide.

Chat is not available.