Workshop: Generalizable Policy Learning in the Physical World

Don't Freeze Your Embedding: Lessons from Policy Finetuning in Environment Transfer

Victoria Dean · Daniel Toyama · Doina Precup


Reinforcement learning (RL) research often relies on a pretrained vision stack that converts image observations to latent vectors. Using a visual embedding in this way leaves an open question, though: should the vision stack be updated along with the policy? In this work, we evaluate the effectiveness of such decisions in RL transfer settings. We introduce policy update formulations for use after pretraining in a different environment and analyze their performance. Through this evaluation, we also detail emergent metrics of benchmark suites and present results on Atari and AndroidEnv.
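
To make the freeze-or-finetune question concrete, the following is a minimal sketch and not the formulation evaluated in this work. It assumes parameters are stored in a flat dictionary whose keys start with either `encoder` (the vision stack) or `policy` (the policy head). The helper `sgd_step`, the key names, and the gradient values are all hypothetical placeholders chosen for illustration.

```python
def sgd_step(params, grads, lr, freeze_encoder):
    """Apply one plain gradient-descent step to a dict of parameter lists.

    When freeze_encoder is True, entries whose name starts with "encoder"
    are copied unchanged, so only the policy head is updated.
    """
    updated = {}
    for name, values in params.items():
        if freeze_encoder and name.startswith("encoder"):
            # Frozen vision stack: keep the pretrained weights as they are.
            updated[name] = list(values)
        else:
            # Trainable parameters: move against the gradient.
            updated[name] = [v - lr * g for v, g in zip(values, grads[name])]
    return updated


# Hypothetical pretrained weights and placeholder gradients (not real data).
params = {"encoder/w": [0.5, -0.2], "policy/w": [1.0, 0.3]}
grads = {"encoder/w": [0.1, 0.1], "policy/w": [0.2, -0.4]}

# Option 1: freeze the embedding and update only the policy head.
frozen = sgd_step(params, grads, lr=0.1, freeze_encoder=True)

# Option 2: finetune the embedding together with the policy head.
finetuned = sgd_step(params, grads, lr=0.1, freeze_encoder=False)
```

In this sketch both options produce the same policy-head update. They differ only in whether the encoder weights move away from their pretrained values, which is the decision the abstract asks about.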
