ICLR Poster Optimal Sketching for Residual Error Estimation for Matrix and Vector Norms

Yi Li · Honghao Lin · David Woodruff

Halle B #243

Abstract: We study the problem of residual error estimation for matrix and vector norms using a linear sketch. Such estimates can be used, for example, to quickly assess how useful a more expensive low-rank approximation computation will be. The matrix case concerns the Frobenius norm and the task is to approximate the $k$-residual $\|A - A_k\|_F$ of the input matrix $A$ within a $(1+\epsilon)$-factor, where $A_k$ is the optimal rank-$k$ approximation. We provide a tight bound of $\Theta(k^2/\epsilon^4)$ on the size of bilinear sketches, which have the form of a matrix product $SAT$. This improves the previous $O(k^2/\epsilon^6)$ upper bound in (Andoni et al., SODA 2013) and gives the first non-trivial lower bound, to the best of our knowledge. In our algorithm, our sketching matrices $S$ and $T$ can both be sparse matrices, allowing for a very fast update time. We demonstrate that this gives a substantial advantage empirically, for roughly the same sketch size and accuracy as in previous work. For the vector case, we consider the $\ell_p$-norm for $p > 2$, where the task is to approximate the $k$-residual $\|x - x_k\|_p$ up to a constant factor, where $x_k$ is the optimal $k$-sparse approximation to $x$. Such vector norms are frequently studied in the data stream literature and are useful for finding frequent items or so-called heavy hitters. We establish an upper bound of $O(k^{2/p}n^{1-2/p}\operatorname{poly}(\log n))$ for constant $\epsilon$ on the dimension of a linear sketch for this problem. Our algorithm can be extended to the $\ell_p$ sparse recovery problem with the same sketching dimension, which seems to be the first such bound for $p > 2$. We also show an $\Omega(k^{2/p}n^{1-2/p})$ lower bound for the sparse recovery problem, which is tight up to a $\operatorname{poly}(\log n)$ factor.
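
To make two of the objects in the abstract concrete, the following is a minimal Python sketch, not the authors' algorithm. It computes the vector $k$-residual $\|x - x_k\|_p$ directly from its definition, and it maintains a bilinear sketch $SAT$ under single-entry updates to $A$. The choice of CountSketch-style matrices (one random $\pm 1$ entry per column of $S$ and per row of $T$) is an assumption made for illustration; the abstract says only that $S$ and $T$ can be sparse. The names `k_residual_lp`, `SparseBilinearSketch`, and `sketch_matrix` are hypothetical, and the estimator that recovers $\|A - A_k\|_F$ from the sketch is not shown.

```python
import random


def k_residual_lp(x, k, p):
    """Return ||x - x_k||_p, where x_k keeps the k largest-magnitude entries of x."""
    # The optimal k-sparse approximation zeroes out everything except the
    # top-k entries, so the residual is the l_p norm of the remaining tail.
    tail = sorted((abs(v) for v in x), reverse=True)[k:]
    return sum(v ** p for v in tail) ** (1.0 / p)


class SparseBilinearSketch:
    """Maintains S A T for CountSketch-style S (rows x n) and T (d x cols).

    Assumption for illustration: each column of S and each row of T holds a
    single +/-1 entry, so an update to A[i][j] touches one cell of the sketch.
    """

    def __init__(self, n, d, rows, cols, seed=0):
        rng = random.Random(seed)
        # Bucket and sign for each row index of A (defines S).
        self.row_bucket = [rng.randrange(rows) for _ in range(n)]
        self.row_sign = [rng.choice((-1, 1)) for _ in range(n)]
        # Bucket and sign for each column index of A (defines T).
        self.col_bucket = [rng.randrange(cols) for _ in range(d)]
        self.col_sign = [rng.choice((-1, 1)) for _ in range(d)]
        self.table = [[0.0] * cols for _ in range(rows)]

    def update(self, i, j, delta):
        """Apply A[i][j] += delta to the sketch in O(1) time."""
        r, c = self.row_bucket[i], self.col_bucket[j]
        self.table[r][c] += self.row_sign[i] * self.col_sign[j] * delta


def sketch_matrix(A, rows, cols, seed=0):
    """Sketch a dense matrix (list of lists) by feeding it entry by entry."""
    sk = SparseBilinearSketch(len(A), len(A[0]), rows, cols, seed)
    for i, row in enumerate(A):
        for j, v in enumerate(row):
            sk.update(i, j, v)
    return sk
```

For example, `k_residual_lp([5.0, -4.0, 3.0, 0.0, 1.0], 2, 3)` drops the entries of magnitude 5 and 4 and returns $(3^3 + 1^3 + 0^3)^{1/3} = 28^{1/3}$. Because the sketch is linear, two sketches built with the same seed can be added cell by cell to obtain the sketch of the sum of the underlying matrices, which is what makes this kind of summary usable in a streaming setting.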
