

Poster
in
Workshop: A Roadmap to Never-Ending RL

Persistent Reinforcement Learning via Subgoal Curricula

Archit Sharma · Abhishek Gupta · Karol Hausman · Sergey Levine · Chelsea Finn


Abstract:

Reinforcement learning (RL) is becoming increasingly successful for robotics beyond simulated environments. However, the success of such reinforcement learning systems is predicated on an often under-emphasised reset mechanism: each trial needs to start from a fixed initial state distribution. Unfortunately, resetting the environment to its initial state after each trial in the real world often requires extensive instrumentation and engineering effort, or manual human supervision to orchestrate resets, which defeats the purpose of autonomous reinforcement learning. In this work, we formalize persistent reinforcement learning: a problem setting that explicitly accounts for the fact that environment resets are not freely available. We then introduce Value-accelerated Persistent Reinforcement Learning (VaPRL), which learns efficiently on a constrained budget of resets by generating a curriculum of increasingly harder tasks that converges to the evaluation setting. We observe that our proposed algorithm requires only a handful of environmental resets, reducing the requirement by several orders of magnitude while outperforming competitive baselines on a range of continuous control environments. Overall, we hope that the reduced reliance on environmental resets can enable agents to learn with greater autonomy in the real world.
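
To make the curriculum idea concrete, below is a minimal Python sketch of selecting training subgoals from previously visited states using a goal-conditioned value estimate. The thresholded selection rule, the toy value function, and all names here are illustrative assumptions for a reset-limited setting, not the exact criterion used by VaPRL.

```python
# Illustrative sketch of a value-based subgoal curriculum (assumed rule, not VaPRL's).
import numpy as np


def select_subgoal(candidates, eval_start, goal, value_fn, threshold=0.7):
    """Choose the next training subgoal from previously visited states.

    Keep candidates the agent can already solve according to its goal-conditioned
    value estimate, then prefer the one nearest the evaluation-time start state,
    so the curriculum gets harder as the policy improves.
    """
    values = np.array([value_fn(s, goal) for s in candidates])
    feasible = candidates[values >= threshold]
    if len(feasible) == 0:
        # Nothing clears the threshold yet: train from the easiest candidate.
        return candidates[np.argmax(values)]
    dists = np.linalg.norm(feasible - eval_start, axis=1)
    return feasible[np.argmin(dists)]


if __name__ == "__main__":
    # Toy 1-D task: states are scalars, goal at 1.0, evaluation starts at 0.0.
    rng = np.random.default_rng(0)
    visited = rng.uniform(0.0, 1.0, size=(50, 1))       # states seen so far
    goal, eval_start = np.array([1.0]), np.array([0.0])

    # Stand-in value function: closer to the goal -> higher estimated value.
    value_fn = lambda s, g: 1.0 - abs(float(s[0] - g[0]))

    print("next training subgoal:", select_subgoal(visited, eval_start, goal, value_fn))
```

As the value estimates improve, more distant candidates clear the threshold and the selected subgoal drifts toward the evaluation start state, which is the sense in which the curriculum "converges to the evaluation setting" without requiring a reset after every trial.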
