Pareto Variational Autoencoder
Mincheol Cho ⋅ Yedarm Seong ⋅ Joong-Ho Won
Abstract
This paper introduces a new class of multivariate power-law distributions---the symmetric Pareto (symPareto) distribution---which can be viewed as an $\ell_1$-norm-based counterpart of the multivariate $t$ distribution, motivated by capturing the heavy tails of target distributions in generative modeling and by bringing robustness to noise in downstream tasks such as image denoising. The symPareto distribution possesses many attractive information-geometric properties with respect to the $\gamma$-power divergence, a natural alternative for power-law families to the Kullback-Leibler divergence that lies at the core of conventional variational autoencoder (VAE) models. Leveraging the joint minimization view of variational inference, this paper proposes the ParetoVAE, a probabilistic autoencoder that minimizes the $\gamma$-power divergence between two statistical manifolds. ParetoVAE employs the symPareto distribution for both the prior and the encoder, with flexible decoder options including the multivariate $t$ and symPareto distributions. Empirical evidence demonstrates the effectiveness of ParetoVAE across multiple domains by varying the decoder type: the $t$ decoder achieves superior performance in sparse, heavy-tailed data reconstruction and word frequency analysis, while the symPareto decoder enables robust high-dimensional denoising.