Dynamics-Aware Embeddings

William Whitney; Rajat Agarwal; Kyunghyun Cho; Abhinav Gupta

Abstract: In this paper we consider self-supervised representation learning to improve sample efficiency in reinforcement learning (RL). We propose a forward prediction objective for simultaneously learning embeddings of states and actions. These embeddings capture the structure of the environment's dynamics, enabling efficient policy learning. We demonstrate that our action embeddings alone improve the sample efficiency and peak performance of model-free RL on control from low-dimensional states. By combining state and action embeddings, we achieve efficient learning of high-quality policies on goal-conditioned continuous control from pixel observations in only 1-2 million environment steps.

Dynamics-Aware Embeddings

William Whitney, Rajat Agarwal, Kyunghyun Cho, Abhinav Gupta

Similar Papers

Keep Doing What Worked: Behavior Modelling Priors for Offline Reinforcement Learning

Noah Siegel, Jost Tobias Springenberg, Felix Berkenkamp, Abbas Abdolmaleki, Michael Neunert, Thomas Lampe, Roland Hafner, Nicolas Heess, Martin Riedmiller,

SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards

Siddharth Reddy, Anca D. Dragan, Sergey Levine,

Ranking Policy Gradient

Kaixiang Lin, Jiayu Zhou,