Workshop: Reincarnating Reinforcement Learning

Model-Based Adversarial Imitation Learning As Online Fine-Tuning

Rafael Rafailov · Victor Kolev · Kyle Hatch · John Martin · Mariano Phielipp · Jiajun Wu · Chelsea Finn


In many real-world sequential decision-making applications, such as robotics or autonomous driving, expert-level data is available (or easily obtainable) through methods such as tele-operation. However, directly learning to copy these expert behaviours can result in poor performance due to distribution shift at deployment time. Adversarial imitation learning algorithms alleviate this issue by learning to match the expert state-action distribution through additional environment interactions. Such methods are built around standard reinforcement-learning algorithms, with both model-based and model-free approaches. In this work, we focus on the model-based approach and argue that algorithms developed for online RL are sub-optimal for the distribution-matching problem. We theoretically justify the use of conservative algorithms developed for the offline learning paradigm in online adversarial imitation learning, and empirically demonstrate improved performance and safety on a complex long-range robot manipulation task, directly from images.
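The distribution-matching idea behind adversarial imitation learning can be illustrated with a minimal sketch: a discriminator is trained to separate expert state-action pairs from policy state-action pairs, and its output is turned into a surrogate reward that is high wherever the policy's samples look like the expert's. The toy Gaussian data, logistic discriminator, and reward form `-log(1 - D)` below are illustrative assumptions, not the paper's model-based algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-D state-action features: expert samples cluster near +1,
# the current policy's samples near -1 (assumed synthetic data).
expert = rng.normal(loc=1.0, scale=0.5, size=(256, 2))
policy = rng.normal(loc=-1.0, scale=0.5, size=(256, 2))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Logistic discriminator D(s, a) = sigmoid(w . x + b), trained with
# binary cross-entropy to output 1 on expert data and 0 on policy data.
w, b, lr = np.zeros(2), 0.0, 0.1
x = np.vstack([expert, policy])
y = np.concatenate([np.ones(len(expert)), np.zeros(len(policy))])

for _ in range(200):
    p = sigmoid(x @ w + b)
    grad = p - y                       # BCE gradient w.r.t. the logits
    w -= lr * (x.T @ grad) / len(x)
    b -= lr * grad.mean()

# Surrogate reward for the policy's samples: large where the
# discriminator mistakes them for expert behaviour, so maximizing it
# pushes the policy's state-action distribution toward the expert's.
d_policy = sigmoid(policy @ w + b)
reward = -np.log(1.0 - d_policy + 1e-8)
```

In a full adversarial imitation loop, this discriminator update would alternate with an RL step that maximizes `reward`; the paper's argument concerns which RL algorithm (online-optimistic vs. offline-conservative) should fill that inner step.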
