Oral in Workshop: 7th Robot Learning Workshop: Towards Robots with Human-Level Abilities
Learning a Thousand Tasks in a Day
Kamil Dreczkowski · Pietro Vitiello · Vitalis Vosylius · Edward Johns
Sat 26 Apr 5:55 p.m. PDT — 3 a.m. PDT
Humans are remarkably efficient at learning tasks from demonstrations, but today's imitation learning methods for robot manipulation often require hundreds or thousands of demonstrations per task. To bridge this gap, we found that decomposing reasoning into two sequential phases, object alignment followed by object interaction, can enable robots to learn everyday tasks from just a single demonstration. We systematically evaluated this decomposition by comparing different design choices for each phase of reasoning, and by studying its generalisation and scaling trends relative to today's dominant paradigm of behavioural cloning with a single-phase monolithic policy. Through 3,450 real-world policy rollouts, we reached two conclusions: when learning from few demonstrations per task, decomposition significantly outperforms learning the full trajectory in a single phase, and, for each phase, reasoning via retrieval in a learned latent space outperforms behavioural cloning. Building on these insights, we designed Multi-Task Trajectory Transfer (MT3), a novel imitation learning method based on decomposition and retrieval that can learn everyday manipulation tasks from only a single demonstration each, whilst also generalising efficiently to novel objects. This major leap in data efficiency ultimately enabled us to teach a robot 1000 distinct everyday tasks within just 24 hours of human demonstrator time. Videos of our experiments can be found on our anonymous website https://sites.google.com/view/1000-tasks.
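
To make the idea of retrieval followed by two-phase execution concrete, here is a minimal sketch. It is not the authors' MT3 implementation: the function names, the use of cosine similarity, and the demonstration fields `align_pose` and `interaction_trajectory` are all hypothetical, chosen only to illustrate how a stored demonstration might be retrieved in an embedding space and then replayed as an alignment phase followed by an interaction phase.

```python
import numpy as np


def retrieve_demo(query_embedding, demo_embeddings):
    """Return the index of the stored demonstration nearest to the query.

    Assumption: similarity is cosine similarity between embedding vectors.
    The abstract only states that retrieval happens in a learned latent space.
    """
    q = query_embedding / np.linalg.norm(query_embedding)
    d = demo_embeddings / np.linalg.norm(demo_embeddings, axis=1, keepdims=True)
    return int(np.argmax(d @ q))


def plan_two_phase(query_embedding, demo_embeddings, demos):
    """Build a two-step plan: object alignment first, then object interaction.

    `demos` is a hypothetical list of dicts, one per demonstration, each with
    an `align_pose` and an `interaction_trajectory` entry.
    """
    demo = demos[retrieve_demo(query_embedding, demo_embeddings)]
    return [
        ("align", demo["align_pose"]),
        ("interact", demo["interaction_trajectory"]),
    ]
```

In this sketch a single retrieval selects the demonstration for both phases. The abstract says retrieval is used for each phase, so a closer approximation might run a separate retrieval step per phase.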