Controlling generative models with continuous factors of variations

Antoine Plumerault, Hervé Le Borgne, Céline Hudelot

Keywords: computer vision, GAN, generative models, interpretability, NLP

Abstract: Recent deep generative models can produce photo-realistic images as well as visual and textual embeddings useful for many computer vision and natural language processing tasks. Their usefulness is nevertheless often limited by the lack of control over the generative process and by a poor understanding of the learned representation. To overcome these issues, recent works have studied the semantics of the latent space of generative models. In this paper, we advance the interpretability of this latent space by introducing a new method that finds, in the latent space of any generative model, meaningful directions along which one can move to precisely control specific properties of the generated image, such as the position or scale of the object it contains. Our method is weakly supervised and particularly well suited to finding directions that encode simple transformations of the generated image, such as translation, zoom, or color variations. We demonstrate its effectiveness qualitatively and quantitatively, both for GANs and for variational auto-encoders.
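
Below is a minimal sketch of the latent-traversal idea the abstract describes: move a latent code along a candidate direction and decode each step with a pretrained generator, so that one factor (position, scale, color, ...) varies across the decoded images. This is an illustration, not the authors' training procedure; the `generate` stub, the latent dimension, and the step range are hypothetical placeholders, and the direction `d` is drawn at random here rather than learned as in the paper.

```python
import numpy as np

def generate(z: np.ndarray) -> np.ndarray:
    # Stand-in for a pretrained GAN generator or VAE decoder:
    # here it simply returns a dummy 32x32 "image" derived from z.
    return np.tanh(np.outer(z[:32], z[:32]))

latent_dim = 128
rng = np.random.default_rng(0)

# A latent code and a candidate direction supposed to encode one
# transformation (e.g. horizontal translation). In the paper, such
# a direction is found with weak supervision; here it is random.
z = rng.standard_normal(latent_dim)
d = rng.standard_normal(latent_dim)
d /= np.linalg.norm(d)  # normalize to a unit-length direction

# Traverse the latent space: the decoded images should vary smoothly
# in the controlled factor as the step size alpha changes.
images = [generate(z + alpha * d) for alpha in np.linspace(-3.0, 3.0, 7)]
```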
