ICLR 2023 Multi-Rate VAE: Train Once, Get the Full Rate-Distortion Curve Oral

In-Person Oral presentation / top 5% paper

Juhan Bae · Michael Zhang · Michael Ruan · Duanyang Wang · So Hasegawa · Jimmy Ba · Roger Grosse

Oral 3 Track 2: Deep Learning and representational learning

Abstract: Variational autoencoders (VAEs) are powerful tools for learning latent representations of data used in a wide range of applications. In practice, VAEs usually require multiple training rounds to choose the amount of information the latent variable should retain. This trade-off between the reconstruction error (distortion) and the KL divergence (rate) is typically parameterized by a hyperparameter $\beta$. In this paper, we introduce Multi-Rate VAE (MR-VAE), a computationally efficient framework for learning optimal parameters corresponding to various values of $\beta$ in a single training run. The key idea is to explicitly formulate a response function using hypernetworks that maps $\beta$ to the optimal parameters. MR-VAEs construct a compact response hypernetwork where the pre-activations are conditionally gated based on $\beta$. We justify the proposed architecture by analyzing linear VAEs and showing that it can represent response functions exactly for linear VAEs. With the learned hypernetwork, MR-VAEs can construct the rate-distortion curve without additional training and can be deployed with significantly less hyperparameter tuning. Empirically, our approach is competitive with, and often exceeds the performance of, training multiple $\beta$-VAEs, with minimal computation and memory overheads.
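
As a rough illustration of what "pre-activations conditionally gated based on $\beta$" could look like, here is a minimal sketch in plain Python. The abstract does not specify the gate, so everything here is an assumption made for illustration: the class name `BetaGatedLinear`, the sigmoid gate on an affine function of $\log \beta$, the per-unit gate parameters `gate_w` and `gate_b`, the log-uniform sampler `sample_beta`, and the weighted objective `distortion + beta * rate`. It is not the authors' implementation.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class BetaGatedLinear:
    """Linear layer whose pre-activations are scaled by a beta-dependent gate.

    Hypothetical sketch: the gate form is assumed, not taken from the abstract.
    """

    def __init__(self, in_dim, out_dim, seed=0):
        rng = random.Random(seed)
        # Ordinary layer parameters, shared across all values of beta.
        self.W = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)]
                  for _ in range(out_dim)]
        self.b = [0.0] * out_dim
        # Assumed hypernetwork parameters: one slope and offset per output unit.
        self.gate_w = [rng.gauss(0.0, 0.1) for _ in range(out_dim)]
        self.gate_b = [0.0] * out_dim

    def gate(self, beta):
        # Assumed gate: sigmoid of an affine function of log(beta).
        log_beta = math.log(beta)
        return [sigmoid(w * log_beta + b)
                for w, b in zip(self.gate_w, self.gate_b)]

    def pre_activation(self, x):
        return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
                for row, b_i in zip(self.W, self.b)]

    def forward(self, x, beta):
        # Elementwise product of the gate and the ungated pre-activations.
        return [g * p for g, p in zip(self.gate(beta), self.pre_activation(x))]


def sample_beta(rng, low=0.01, high=10.0):
    # Assumed training-time sampler: log-uniform over [low, high].
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def weighted_objective(distortion, rate, beta):
    # Rate-distortion trade-off as described in the abstract: beta weights the rate.
    return distortion + beta * rate


layer = BetaGatedLinear(in_dim=3, out_dim=2)
x = [1.0, -2.0, 0.5]
out_low_beta = layer.forward(x, beta=0.01)
out_high_beta = layer.forward(x, beta=10.0)
```

Under these assumptions, a single set of weights serves every $\beta$: at training time one would draw a fresh `sample_beta(...)` per step and minimize `weighted_objective`, and afterwards sweep `beta` through `forward` to trace different points on the rate-distortion curve without retraining.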
