

Invited Talk

From Generative Models to Generative Agents

Koray Kavukcuoglu

Exhibition Hall A

Abstract:

Deep generative models of speech have recently surpassed classic algorithms and achieved state-of-the-art performance in text-to-speech (TTS). Specifically, the WaveNet model has significantly increased the quality of generated speech, and it has already been transferred into a fast architecture that enables its use in the real world with Google Assistant. On the other hand, new developments in Deep RL agents such as the IMPALA architecture have resulted in increased performance across 30 complex 3D environments and have demonstrated positive transfer whilst solving all tasks at the same time. In my talk, I will first explain the recent developments in the WaveNet project and its application as a production TTS system at Google. I will then explain our most recent agent framework, IMPALA, and demonstrate its performance on the new DMLab30 challenge domain. Finally, I will introduce very recent work combining these two general research directions: the SPIRAL algorithm, an agent that can learn a generative model of images by creating a visual program over brush strokes.
