Poster
Efficient Sparse PCA via Block-Diagonalization
Alberto Del Pia · Dekun Zhou · Yinglun Zhu
Hall 3 + Hall 2B #363
Thu 24 Apr 7 p.m. PDT — 9:30 p.m. PDT
Abstract:
Sparse Principal Component Analysis (Sparse PCA) is a pivotal tool in data analysis and dimensionality reduction. However, Sparse PCA is a challenging problem in both theory and practice: it is known to be NP-hard, and current exact methods generally require exponential runtime. In this paper, we propose a novel framework to efficiently approximate Sparse PCA by (i) approximating the general input covariance matrix with a re-sorted block-diagonal matrix, (ii) solving the Sparse PCA sub-problem in each block, and (iii) reconstructing the solution to the original problem. Our framework is simple and powerful: it can leverage any off-the-shelf Sparse PCA algorithm and achieve significant computational speedups, with a minor additive error that is linear in the approximation error of the block-diagonal matrix. Suppose g(k, d) is the runtime of an algorithm that (approximately) solves Sparse PCA in dimension d with sparsity parameter k. Our framework, when integrated with this algorithm, reduces the runtime to O((d/d⋆) · g(k, d⋆) + d²), where d⋆ ≤ d is the largest block size of the block-diagonal matrix. For instance, integrating our framework with the Branch-and-Bound algorithm reduces the complexity from g(k, d) = O(k³ · d^k) to O(k³ · d · (d⋆)^(k−1)), an exponential speedup when d⋆ is small. We perform large-scale evaluations on many real-world datasets: with an exact Sparse PCA algorithm, our method achieves an average speedup factor of 100.50 while maintaining an average approximation error of 0.61%; with an approximate Sparse PCA algorithm, our method achieves an average speedup factor of 6.00 and an average approximation error of −0.91%, meaning that our method often finds better solutions.
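The three-step framework in the abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration, not the authors' implementation: the blocks are found by thresholding small off-diagonal entries at a tolerance `eps` and taking connected components, the per-block sub-problems are solved by a brute-force exact solver (`sparse_pca_exact`, feasible only when d⋆ is small), and the reconstruction simply keeps the block solution with the largest objective and embeds it into the full dimension.

```python
import numpy as np
from itertools import combinations
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def sparse_pca_exact(A, k):
    """Brute-force Sparse PCA on a small block: search all k-subsets
    of coordinates for the largest top eigenvalue of the principal
    submatrix. Stand-in for any off-the-shelf Sparse PCA solver."""
    d = A.shape[0]
    best_val, best_vec, best_support = -np.inf, None, None
    for S in combinations(range(d), min(k, d)):
        w, v = np.linalg.eigh(A[np.ix_(S, S)])
        if w[-1] > best_val:  # eigh returns eigenvalues in ascending order
            best_val, best_vec, best_support = w[-1], v[:, -1], S
    return best_val, best_vec, best_support

def block_diag_sparse_pca(A, k, eps):
    """Sketch of the block-diagonalization framework (assumed details)."""
    d = A.shape[0]
    # (i) Approximate A by a block-diagonal matrix: zero out entries
    # with magnitude <= eps, then take connected components as blocks.
    mask = np.abs(A) > eps
    n_blocks, labels = connected_components(csr_matrix(mask), directed=False)
    best_val, x = -np.inf, np.zeros(d)
    for b in range(n_blocks):
        idx = np.where(labels == b)[0]
        # (ii) Solve the Sparse PCA sub-problem within this block.
        val, vec, support = sparse_pca_exact(A[np.ix_(idx, idx)], k)
        # (iii) Reconstruct: keep the block solution with the largest
        # objective and embed it into the original d coordinates.
        if val > best_val:
            best_val = val
            x = np.zeros(d)
            x[idx[list(support)]] = vec
    return best_val, x
```

Because each block has at most d⋆ coordinates, the brute-force search runs over subsets of a d⋆-dimensional submatrix rather than the full d-dimensional matrix, which is the source of the speedup the abstract quantifies.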