A Statistical Benchmark for Diffusion Posterior Sampling Algorithms
Abstract
We propose a statistical benchmark for diffusion posterior sampling (DPS) algorithms in linear inverse problems. Our test signals are discretized Lévy processes whose posteriors admit efficient Gibbs samplers. These Gibbs samplers provide gold-standard posterior samples for direct, distribution-level comparisons with DPS algorithms. They can also sample the denoising posteriors that arise in the reverse diffusion, which enables arbitrarily accurate Monte Carlo estimation of quantities that DPS algorithms may require, such as the mean or the covariance of the denoising posteriors. In turn, this can be used to isolate algorithmic errors from errors due to learned components. We instantiate the benchmark with minimum-mean-squared-error optimality gaps and posterior-coverage tests, and evaluate popular algorithms on the inverse problems of denoising, deconvolution, imputation, and reconstruction from partial Fourier measurements. We release the benchmark code at https://github.com/emblem-saying/dps-benchmark and invite the community to contribute and report results.