
Workshop: Reincarnating Reinforcement Learning

MOTO: Offline to Online Fine-tuning for Model-Based Reinforcement Learning

Rafael Rafailov · Kyle Hatch · Victor Kolev · John Martin · Mariano Phielipp · Chelsea Finn


We study the problem of offline-to-online reinforcement learning from high-dimensional pixel observations. While recent model-free approaches successfully use offline pre-training with online fine-tuning to either improve the performance of the data-collection policy or adapt to novel tasks, model-based approaches remain underutilized in this setting. In this work, we argue that existing methods for high-dimensional model-based offline RL are not suitable for offline-to-online fine-tuning due to shifts in the learned representations, off-dynamics data, and non-stationary rewards. We propose a simple on-policy model-based method with adaptive behavior regularization. In our simulation experiments, we find that our approach successfully solves long-horizon robot manipulation tasks entirely from images by using a combination of offline data and online interactions.
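The abstract does not spell out how the adaptive behavior regularization works. As a generic illustration only (not MOTO's actual implementation), one common way to make a behavior-regularization coefficient adaptive is a dual-style update: penalize the policy's divergence from the behavior (data-collection) policy, and raise or lower the penalty weight depending on whether that divergence exceeds a target. All names and the target/learning-rate values below are hypothetical.

```python
import numpy as np

def kl_gaussian(mu_p, std_p, mu_q, std_q):
    # KL(p || q) between diagonal Gaussians; measures how far the
    # current policy p has drifted from the behavior policy q.
    return float(np.sum(
        np.log(std_q / std_p)
        + (std_p**2 + (mu_p - mu_q)**2) / (2.0 * std_q**2)
        - 0.5
    ))

def adapt_coefficient(alpha, kl, target_kl, lr=0.1):
    # Dual-style update in log space: increase the regularization
    # weight when the measured KL exceeds the target, decrease it
    # when the policy stays close to the behavior policy.
    log_alpha = np.log(alpha) + lr * (kl - target_kl)
    return float(np.exp(log_alpha))

# Toy loop: the policy mean drifts away from the behavior policy,
# so the regularization coefficient alpha grows in response.
alpha = 1.0
for step in range(5):
    mu_pi = np.array([0.2 * step])          # drifting policy mean
    kl = kl_gaussian(mu_pi, np.ones(1), np.zeros(1), np.ones(1))
    alpha = adapt_coefficient(alpha, kl, target_kl=0.05)
```

In a full method, `alpha` would weight a behavior-cloning or divergence penalty added to the policy's return objective; the adaptive update lets the constraint relax as online data accumulates and the policy is allowed to move further from the offline dataset.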
