Workshop
Reincarnating Reinforcement Learning
Rishabh Agarwal · Ted Xiao · Yanchao Sun · Max Schwarzer · Susan Zhang
AD1
Thu 4 May, midnight PDT
Learning “tabula rasa”, that is, from scratch without much previously learned knowledge, is the dominant paradigm in reinforcement learning (RL) research. However, learning tabula rasa is the exception rather than the norm for solving larger-scale problems. Additionally, the inefficiency of tabula rasa RL typically excludes the majority of researchers outside certain resource-rich labs from tackling computationally demanding problems. To address these inefficiencies and help unlock the full potential of deep RL, our workshop aims to bring further attention to the emerging paradigm of reusing prior computation in RL, discuss its potential benefits and real-world applications, discuss its current limitations and challenges, and come up with concrete problem statements and evaluation protocols for the research community to work on. Furthermore, we hope to foster discussion through panels (with audience participation), several contributed talks, and short opinion papers welcomed in our call for papers.
Schedule
Thu 12:00 a.m. - 12:10 a.m.
|
Introduction ( Introduction ) > SlidesLive Video | 🔗 |
Thu 12:10 a.m. - 12:40 a.m.
|
Invited Talk by Avishkar Bhoopchand: Human-Timescale Adaptation in an Open-Ended Task Space ( Invited Talk ) > SlidesLive Video | Avishkar Bhoopchand 🔗 |
Thu 12:40 a.m. - 12:50 a.m.
|
Read and Reap the Rewards: Learning to Play Atari with the Help of Instruction Manuals ( Oral ) > link | SlidesLive Video | Yue Wu · Yewen Fan · Paul Pu Liang · Amos Azaria · Yuanzhi Li · Tom Mitchell 🔗 |
Thu 12:50 a.m. - 1:00 a.m.
|
Reduce, Reuse, Recycle: Selective Reincarnation in Multi-Agent Reinforcement Learning ( Oral ) > link | SlidesLive Video | Juan Formanek · Callum R. Tilbury · Jonathan P Shock · Kale-ab Tessera · Arnu Pretorius 🔗 |
Thu 1:00 a.m. - 1:10 a.m.
|
Learning to Modulate pre-trained Models in RL ( Oral ) > link | SlidesLive Video | Thomas Schmied · Markus Hofmarcher · Fabian Paischer · Razvan Pascanu · Sepp Hochreiter 🔗 |
Thu 1:10 a.m. - 1:20 a.m.
|
Do Embodied Agents Dream of Pixelated Sheep?: Embodied Decision Making using Language Guided World Modelling ( Oral ) > link | SlidesLive Video | Kolby Nottingham · Prithviraj Ammanabrolu · Alane Suhr · Yejin Choi · Hannaneh Hajishirzi · Sameer Singh · Roy Fox 🔗 |
Thu 1:20 a.m. - 1:30 a.m.
|
Towards A Unified Agent with Foundation Models ( Oral ) > link | Norman Di Palo · Arunkumar Byravan · Leonard Hasenclever · Markus Wulfmeier · Nicolas Heess · Martin Riedmiller 🔗 |
Thu 1:30 a.m. - 1:35 a.m.
|
Merging Decision Transformers: Weight Averaging for Forming Multi-Task Policies ( Spotlight ) > link | SlidesLive Video | Daniel Lawson · Ahmed Qureshi 🔗 |
Thu 1:35 a.m. - 1:40 a.m.
|
Deep Reinforcement Learning with Plasticity Injection ( Spotlight ) > link | SlidesLive Video | Evgenii Nikishin · Junhyuk Oh · Georg Ostrovski · Clare Lyle · Razvan Pascanu · Will Dabney · Andre Barreto 🔗 |
Thu 1:40 a.m. - 1:45 a.m.
|
Synthetic Experience Replay ( Spotlight ) > link | SlidesLive Video | Cong Lu · Philip Ball · Jack Parker-Holder 🔗 |
Thu 1:45 a.m. - 1:50 a.m.
|
Where are we in the search for an Artificial Visual Cortex for Embodied Intelligence? ( Spotlight ) > link | SlidesLive Video | 14 presenters: Arjun Majumdar · Karmesh Yadav · Sergio Arnaud · Yecheng Jason Ma · Claire Chen · Sneha Silwal · Aryan Jain · Vincent-Pierre Berges · Pieter Abbeel · Dhruv Batra · Yixin Lin · Oleksandr Maksymets · Aravind Rajeswaran · Franziska Meier |
Thu 1:50 a.m. - 1:55 a.m.
|
Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning ( Spotlight ) > link | SlidesLive Video | Mitsuhiko Nakamoto · Yuexiang Zhai · Anikait Singh · Yi Ma · Chelsea Finn · Aviral Kumar · Sergey Levine 🔗 |
Thu 1:55 a.m. - 2:00 a.m.
|
TGRL: Teacher Guided Reinforcement Learning Algorithm for POMDPs ( Spotlight ) > link | SlidesLive Video | Idan Shenfeld · Zhang-Wei Hong · Aviv Tamar · Pulkit Agrawal 🔗 |
Thu 2:00 a.m. - 2:05 a.m.
|
Co-Imitation Learning without Expert Demonstration ( Spotlight ) > link | Kun-Peng Ning · Hu Xu · Kun Zhu · Sheng-Jun Huang 🔗 |
Thu 2:05 a.m. - 4:00 a.m.
|
Poster Session ( Poster ) > | 🔗 |
Thu 4:00 a.m. - 5:00 a.m.
|
Lunch Break | 🔗 |
Thu 5:00 a.m. - 5:30 a.m.
|
Invited Talk by Joseph Lim: Skill Reuse in Deep Reinforcement Learning ( Invited Talk ) > SlidesLive Video | Joseph Lim 🔗 |
Thu 5:30 a.m. - 6:00 a.m.
|
Invited Talk by Furong Huang: Adaptable Reinforcement Learning in An Ever-Changing World ( Invited Talk ) > | Furong Huang 🔗 |
Thu 6:00 a.m. - 6:30 a.m.
|
Invited Talk by Anna Goldie: RL for Chip Design / LLMs ( Invited Talk ) > SlidesLive Video | Anna Goldie 🔗 |
Thu 6:30 a.m. - 7:00 a.m.
|
Invited Talk by Sergey Levine: Leveraging Offline Datasets / Foundation Models for Real-World RL ( Invited Talk ) > SlidesLive Video | Sergey Levine 🔗 |
Thu 7:00 a.m. - 7:50 a.m.
|
Panel Discussion: Challenges & Open Problems in Reusing Prior Computation ( Discussion Panel ) > SlidesLive Video | Joseph Lim · Furong Huang · Marc G Bellemare · Linxi Fan · Jeff Clune · Anna Goldie 🔗 |
Thu 7:50 a.m. - 8:00 a.m.
|
Closing Remarks ( Closing Remarks ) > | 🔗 |
-
|
Unsupervised Object Interaction Learning with Counterfactual Dynamics Models ( Poster ) > link | Jongwook Choi · Sungtae Lee · Xinyu Wang · Sungryull Sohn · Honglak Lee 🔗 |
-
|
Chain-of-Thought Predictive Control with Behavior Cloning ( Poster ) > link | Zhiwei Jia · Fangchen Liu · Vineet Thumuluri · Linghao Chen · Zhiao Huang · Hao Su 🔗 |
-
|
Learning to Modulate pre-trained Models in RL ( Poster ) > link | Thomas Schmied · Markus Hofmarcher · Fabian Paischer · Razvan Pascanu · Sepp Hochreiter 🔗 |
-
|
Think Before You Act: Unified Policy for Interleaving Language Reasoning with Actions ( Poster ) > link | Lina Mezghani · Sainbayar Sukhbaatar · Piotr Bojanowski · Karteek Alahari 🔗 |
-
|
Offline Visual Representation Learning for Embodied Navigation ( Poster ) > link | Karmesh Yadav · Ram Ramrakhya · Arjun Majumdar · Vincent-Pierre Berges · Sachit Kuhar · Dhruv Batra · Alexei Baevski · Oleksandr Maksymets 🔗 |
-
|
Imitation from Arbitrary Experience: A Dual Unification of Reinforcement and Imitation Learning Methods ( Poster ) > link | Harshit Sikchi · Amy Zhang · Scott Niekum 🔗 |
-
|
Self-Generating Data for Goal-Conditioned Compositional Problems ( Poster ) > link | Ying Yuan · Yunfei Li · Yi Wu 🔗 |
-
|
LIV: Language-Image Representations and Rewards for Robotic Control ( Poster ) > link | Yecheng Jason Ma · Vikash Kumar · Amy Zhang · Osbert Bastani · Dinesh Jayaraman 🔗 |
-
|
Model-Based Adversarial Imitation Learning As Online Fine-Tuning ( Poster ) > link | Rafael Rafailov · Victor Kolev · Kyle Hatch · John Martin · Mariano Phielipp · Jiajun Wu · Chelsea Finn 🔗 |
-
|
MOTO: Offline to Online Fine-tuning for Model-Based Reinforcement Learning ( Poster ) > link | Rafael Rafailov · Kyle Hatch · Victor Kolev · John Martin · Mariano Phielipp · Chelsea Finn 🔗 |
-
|
Masked Trajectory Models for Prediction, Representation, and Control ( Poster ) > link | Philipp Wu · Arjun Majumdar · Kevin Stone · Yixin Lin · Igor Mordatch · Pieter Abbeel · Aravind Rajeswaran 🔗 |
-
|
Where are we in the search for an Artificial Visual Cortex for Embodied Intelligence? ( Poster ) > link | 14 presenters: Arjun Majumdar · Karmesh Yadav · Sergio Arnaud · Yecheng Jason Ma · Claire Chen · Sneha Silwal · Aryan Jain · Vincent-Pierre Berges · Pieter Abbeel · Dhruv Batra · Yixin Lin · Oleksandr Maksymets · Aravind Rajeswaran · Franziska Meier |
-
|
Knowledge Transfer from Teachers to Learners in Growing-Batch Reinforcement Learning ( Poster ) > link | Patrick Emedom-Nnamdi · Abram Friesen · Bobak Shahriari · Matthew Hoffman · Nando de Freitas 🔗 |
-
|
Synthetic Experience Replay ( Poster ) > link | Cong Lu · Philip Ball · Jack Parker-Holder 🔗 |
-
|
Task-Agnostic Continual Reinforcement Learning: Gaining Insights and Overcoming Challenges ( Poster ) > link | Massimo Caccia · Jonas Mueller · Taesup Kim · Laurent Charlin · Rasool Fakoor 🔗 |
-
|
Towards A Unified Agent with Foundation Models ( Poster ) > link | Norman Di Palo · Arunkumar Byravan · Leonard Hasenclever · Markus Wulfmeier · Nicolas Heess · Martin Riedmiller 🔗 |
-
|
EDGI: Equivariant diffusion for planning with embodied agents ( Poster ) > link | Johann Brehmer · Joey Bose · Pim De Haan · Taco Cohen 🔗 |
-
|
Reduce, Reuse, Recycle: Selective Reincarnation in Multi-Agent Reinforcement Learning ( Poster ) > link | Juan Formanek · Callum R. Tilbury · Jonathan P Shock · Kale-ab Tessera · Arnu Pretorius 🔗 |
-
|
Beyond Temporal Credit Assignment in Reinforcement Learning ( Poster ) > link | Sephora Madjiheurem · Kimberly Stachenfeld · Peter Battaglia · Jessica Hamrick 🔗 |
-
|
Learning How to Infer Partial MDPs for In-Context Adaptation and Exploration ( Poster ) > link | Chentian Jiang · Nan Rosemary Ke · Hado van Hasselt 🔗 |
-
|
Bootstrapped Representations in Reinforcement Learning ( Poster ) > link | Charline Le Lan · Stephen Tu · Mark Rowland · Anna Harutyunyan · Rishabh Agarwal · Marc G Bellemare · Will Dabney 🔗 |
-
|
Action Inference by Maximising Evidence: Zero-Shot Imitation from Observation with World Models ( Poster ) > link | Xingyuan Zhang · Philip Becker-Ehmck · Patrick van der Smagt · Maximilian Karl 🔗 |
-
|
Deep Reinforcement Learning with Plasticity Injection ( Poster ) > link | Evgenii Nikishin · Junhyuk Oh · Georg Ostrovski · Clare Lyle · Razvan Pascanu · Will Dabney · Andre Barreto 🔗 |
-
|
On The Role of Forgetting in Fine-Tuning Reinforcement Learning Models ( Poster ) > link | Maciej Wołczyk · Bartłomiej Cupiał · Michał Zając · Razvan Pascanu · Łukasz Kuciński · Piotr Miłoś 🔗 |
-
|
Merging Decision Transformers: Weight Averaging for Forming Multi-Task Policies ( Poster ) > link | Daniel Lawson · Ahmed Qureshi 🔗 |
-
|
TGRL: Teacher Guided Reinforcement Learning Algorithm for POMDPs ( Poster ) > link | Idan Shenfeld · Zhang-Wei Hong · Aviv Tamar · Pulkit Agrawal 🔗 |
-
|
PIRLNav: Pretraining with Imitation and RL Finetuning for ObjectNav ( Poster ) > link | Ram Ramrakhya · Dhruv Batra · Erik Wijmans · Abhishek Das 🔗 |
-
|
Bayesian regularization of empirical MDPs ( Poster ) > link | Samarth Gupta · Daniel Hill · Lexing Ying · Inderjit Dhillon 🔗 |
-
|
Prioritized offline Goal-swapping Experience Replay ( Poster ) > link | Wenyan Yang · Joni Pajarinen · Dingding Cai · Joni-Kristian Kamarainen 🔗 |
-
|
Revisiting Behavior Regularized Actor-Critic ( Poster ) > link | Denis Tarasov · Vladislav Kurenkov · Alexander Nikulin · Sergey Kolesnikov 🔗 |
-
|
Successor Feature Representations ( Poster ) > link | Chris Reinke · Xavier Alameda-Pineda 🔗 |
-
|
Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories ( Poster ) > link | Qinqing Zheng · Mikael Henaff · Brandon Amos · Aditya Grover 🔗 |
-
|
Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning ( Poster ) > link | Mitsuhiko Nakamoto · Yuexiang Zhai · Anikait Singh · Yi Ma · Chelsea Finn · Aviral Kumar · Sergey Levine 🔗 |
-
|
Accelerating Policy Gradient by Estimating Value Function from Prior Computation in Deep Reinforcement Learning ( Poster ) > link | Md Masudur Rahman · Yexiang Xue 🔗 |
-
|
Co-Imitation Learning without Expert Demonstration ( Poster ) > link | Kun-Peng Ning · Hu Xu · Kun Zhu · Sheng-Jun Huang 🔗 |
-
|
Instruction-Finetuned Foundation Models for Multimodal Web Navigation ( Poster ) > link | Hiroki Furuta · Ofir Nachum · Kuang-Huei Lee · Yutaka Matsuo · Shixiang Gu · Izzeddin Gur 🔗 |
-
|
Do Embodied Agents Dream of Pixelated Sheep?: Embodied Decision Making using Language Guided World Modelling ( Poster ) > link | Kolby Nottingham · Prithviraj Ammanabrolu · Alane Suhr · Yejin Choi · Hannaneh Hajishirzi · Sameer Singh · Roy Fox 🔗 |
-
|
Multi-Environment Pretraining Enables Transfer to Action Limited Datasets ( Poster ) > link | David Venuto · Sherry Yang · Pieter Abbeel · Doina Precup · Igor Mordatch · Ofir Nachum 🔗 |
-
|
Read and Reap the Rewards: Learning to Play Atari with the Help of Instruction Manuals ( Poster ) > link | Yue Wu · Yewen Fan · Paul Pu Liang · Amos Azaria · Yuanzhi Li · Tom Mitchell 🔗 |