Processing math: 100%
Skip to yearly menu bar Skip to main content


Poster

Quantum (Inspired) D2-sampling with Applications

Poojan Shah · Ragesh Jaiswal

Hall 3 + Hall 2B #454
[ ]
Wed 23 Apr 7 p.m. PDT — 9:30 p.m. PDT

Abstract: D2-sampling is a fundamental component of sampling-based clustering algorithms such as k-means++. Given a dataset VRd with N points and a center set CRd, D2-sampling refers to picking a point from V where the sampling probability of a point is proportional to its squared distance from the nearest center in C.The popular k-means++ algorithm is simply a k-round D2-sampling process, which runs in O(Nkd) time and gives O(logk)-approximation in expectation for the k-means problem.In this work, we give a quantum algorithm for (approximate) D2-sampling in the QRAM model that results in a quantum implementation of k-means++ with a running time ˜O(ζ2k2). Here ζ is the aspect ratio ( i.e., largest to smallest interpoint distance) and ˜O hides polylogarithmic factors in N,d,k.It can be shown through a robust approximation analysis of k-means++ that the quantum version preserves its O(logk) approximation guarantee.Further, we show that our quantum algorithm for D2-sampling can be dequantized using the sample-query access model of Tang (PhD Thesis, Ewin Tang, University of Washington, 2023). This results in a fast quantum-inspired classical implementation of k-means++, which we call QI-k-means++, with a running time O(Nd)+˜O(ζ2k2d), where the O(Nd) term is for setting up the sample-query access data structure.Experimental investigations show promising results for QI-k-means++ on large datasets with bounded aspect ratio.Finally, we use our quantum D2-sampling with the known D2-sampling-based classical approximation scheme to obtain the first quantum approximation scheme for the k-means problem with polylogarithmic running time dependence on N.

Live content is unavailable. Log in and register to view live content