Spotlight
Learning a Latent Simplex in Input Sparsity Time
Ainesh Bakshi · Chiranjib Bhattacharyya · Ravi Kannan · David Woodruff · Samson Zhou
Abstract:
We consider the problem of learning a latent $k$-vertex simplex $K \subseteq \mathbb{R}^d$, given access to $A \in \mathbb{R}^{d \times n}$, which can be viewed as $n$ data points that are formed by randomly perturbing latent points in $K$, possibly beyond $K$. A large class of latent variable models, such as adversarial clustering, mixed membership stochastic block models, and topic models, can be cast as learning a latent simplex. Bhattacharyya and Kannan (SODA 2020) give an algorithm for learning such a $k$-vertex latent simplex in time roughly $O(k \cdot \mathrm{nnz}(A))$, where $\mathrm{nnz}(A)$ is the number of non-zeros in $A$. We show that the dependence on $k$ in the running time is unnecessary, given a natural assumption about the mass of the top $k$ singular values of $A$ that holds in many of these applications. Further, we show this assumption is necessary: otherwise, an algorithm for learning a latent simplex would imply a faster low-rank approximation algorithm than is currently known.
We obtain a spectral low-rank approximation to $A$ in input-sparsity time and show that the column space thus obtained has small $\sin\Theta$ (angular) distance to the right top-$k$ singular space of $A$. Our algorithm then selects $k$ points in the low-rank subspace with the largest inner products (in absolute value) with $k$ carefully chosen random vectors. By working in the low-rank subspace, we avoid reading the entire matrix in each iteration and thus circumvent the $k \cdot \mathrm{nnz}(A)$ running time.
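To make the two-phase structure concrete, here is a minimal NumPy/SciPy sketch, not the paper's algorithm: SciPy's `svds` stands in for the input-sparsity-time spectral low-rank approximation, i.i.d. Gaussian directions stand in for the carefully chosen random vectors, and the function name `learn_simplex_sketch` is ours. What it does illustrate is the key structural point of the abstract: the matrix $A$ is touched only once, during projection, and all $k$ selection iterations run in the $k$-dimensional subspace.

```python
import numpy as np
from scipy.sparse.linalg import svds

def learn_simplex_sketch(A, k, seed=0):
    """Illustrative two-phase sketch: (1) approximate the top-k
    subspace, (2) select k points via random inner products.

    A : scipy.sparse matrix (d x n), columns are data points (float dtype)
    k : number of simplex vertices to estimate
    """
    rng = np.random.default_rng(seed)
    d, n = A.shape

    # Phase 1: approximate top-k left singular subspace of A.
    # svds is a convenient stand-in; the paper instead builds a
    # spectral low-rank approximation in input-sparsity time.
    U, _, _ = svds(A, k=k)            # U: (d, k), orthonormal columns

    # Project all n points into the subspace once. Every later step
    # works on this (k x n) matrix and never rereads A.
    P = (A.T @ U).T                   # (k, n) dense, cost ~ nnz(A)

    # Phase 2: for each of k random directions, keep the projected
    # point with the largest |inner product| as a vertex estimate.
    vertices = np.zeros((d, k))
    for i in range(k):
        u = rng.standard_normal(k)            # random direction
        j = int(np.argmax(np.abs(u @ P)))     # best-aligned point
        vertices[:, i] = U @ P[:, j]          # lift back to R^d
    return vertices
```

For instance, on a sparse matrix built with `scipy.sparse.random(1000, 5000, density=0.01, format="csr")`, calling `learn_simplex_sketch(A, 10)` returns ten candidate vertices while reading $A$ only during the one-time projection.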