Riemannian Metric Matching for Scalable Geometric Modelling of Distributions
Abstract
High-dimensional datasets often concentrate near low-dimensional structures, but estimating their geometry from samples typically relies on graphs and kernels that scale poorly with dataset size and dimension. We propose Riemannian metric matching: a denoising probabilistic framework for learning the Riemannian geometry of data using neural networks. Specifically, we learn the carré du champ operator, which, through diffusion geometry, gives access to the Riemannian-geometry toolkit for downstream machine learning and statistical tasks. Our key observation is that the carré du champ operator can be formulated as a conditional expectation over random perturbations of the data, which can be exploited for sample-wise training and constant-cost, amortized inference without explicit kernel construction. To the best of our knowledge, we provide the first neural surrogate that estimates the underlying Riemannian geometry of data with a provable consistency guarantee in the large-data limit. Empirically, metric matching matches or exceeds the accuracy of graph-based diffusion-geometry estimators, while enabling substantially faster amortized inference, and supports graph-free geometric analysis on high-dimensional images where nearest-neighbor methods break down.
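To make the abstract's key observation concrete, the sketch below illustrates (in the classical Euclidean setting with generator L = Δ, not the paper's learned neural surrogate) how the carré du champ Γ(f, g) = ½(L(fg) − f·Lg − g·Lf) = ∇f·∇g can be estimated as a conditional expectation over Gaussian perturbations of a point. The function name `carre_du_champ_mc` and the specific test functions are illustrative assumptions.

```python
import numpy as np

def carre_du_champ_mc(f, g, x, t=1e-4, n_samples=200_000, rng=None):
    """Monte Carlo estimate of the carré du champ Gamma(f, g)(x)
    for the generator L = Laplacian, via the short-time identity
        Gamma(f, g)(x) = lim_{t->0} (1/(2t)) E[(f(X_t)-f(x)) (g(X_t)-g(x))],
    where X_t = x + sqrt(2t) * eps is a Gaussian perturbation of x.
    (Illustrative sketch; not the paper's neural estimator.)"""
    rng = np.random.default_rng(0) if rng is None else rng
    eps = rng.standard_normal((n_samples, x.shape[0]))
    X = x + np.sqrt(2.0 * t) * eps        # random perturbations of x
    df = f(X) - f(x)                      # increments of f
    dg = g(X) - g(x)                      # increments of g
    return np.mean(df * dg) / (2.0 * t)

# Sanity check against the closed form Gamma(f, g) = grad f . grad g:
f = lambda X: X[..., 0] ** 2              # grad f at (1,1) is (2, 0)
g = lambda X: X[..., 0] + X[..., 1]       # grad g is (1, 1)
x = np.array([1.0, 1.0])
est = carre_du_champ_mc(f, g, x)          # exact value: 2*1 + 0*1 = 2
```

Because each estimate depends only on perturbations of a single point, this formulation supports the sample-wise training and kernel-free inference described above: no pairwise graph over the dataset is ever built.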