Deep generative models of speech have recently surpassed classic algorithms and achieved state-of-the-art performance in text-to-speech (TTS). Specifically, the WaveNet model has significantly increased the quality of generated speech, and it has already been transferred into a fast architecture that enables its use in the real world with Google Assistant. On the other hand, new developments in deep RL agents such as the IMPALA architecture have resulted in increased performance across 30 complex 3D environments and have demonstrated positive transfer whilst solving all tasks at the same time. In my talk, I will first explain the recent developments in the WaveNet project and its application as a production TTS system at Google. I will then explain our most recent agent framework, IMPALA, and demonstrate its performance on the new DMLab30 challenge domain. Finally, I will introduce a very recent work combining these two general research directions: the SPIRAL algorithm, an agent that can learn a generative model of images by creating a visual program over brush strokes.
Wed May 02 09:00 AM -- 09:45 AM (PDT) @ Exhibition Hall A
From Generative Models to Generative Agents
In Wed AM Talks