Poster in Workshop: The 2nd Workshop on World Models: Understanding, Modelling and Scaling

Lifting Ego World Models for Planning and Control

Alex N. Wang ⋅ Trevor Darrell ⋅ Pavel Izmailov ⋅ Yutong Bai ⋅ Amir Bar

Abstract

World models have shown a remarkable ability to predict future observations from high-dimensional action inputs, but planning in complex action spaces such as human joint movement remains a difficult, unsolved problem. Inspired by hierarchical control in humans, we design a goal-conditioned controller policy that generates low-level joint actions conditioned on high-level waypoint inputs. Leveraging waypoint goal-conditioning and short-term motion patterns, we combine our policy with a low-level PEVA world model, lifting its input to the high-level waypoint space. First, we show that waypoint goal-conditioning improves Mean Joint Error (MJE) for a human-like agent by $5.8\times$ while being easily controllable and generalizing to unseen actions. Next, we perform visuomotor planning with the lifted PEVA world model for hybrid navigation-interaction tasks in the Nymeria dataset, improving MJE by up to $4.7\times$ while being more efficient and generalizing to entirely unseen environments.
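
As a rough illustration of what "lifting" could mean here, the sketch below composes a waypoint-conditioned policy with a low-level world model so that the combined model accepts a waypoint instead of joint actions. It is a toy under stated assumptions, not the authors' implementation: `lift_world_model`, `toy_policy`, `toy_world_model`, and `horizon` are hypothetical names, and observations are plain 2-D vectors rather than the egocentric video frames a PEVA-style model would predict.

```python
from typing import Callable, List

Vector = List[float]
Policy = Callable[[Vector, Vector], Vector]      # (observation, waypoint) -> joint action
WorldModel = Callable[[Vector, Vector], Vector]  # (observation, joint action) -> next observation


def lift_world_model(policy: Policy, world_model: WorldModel, horizon: int):
    """Wrap a low-level world model so its input is a waypoint (hypothetical helper)."""

    def lifted(observation: Vector, waypoint: Vector) -> Vector:
        # Roll out `horizon` low-level steps; the policy turns the waypoint
        # into a joint action at every step, and the world model predicts
        # the next observation from that action.
        for _ in range(horizon):
            joint_action = policy(observation, waypoint)
            observation = world_model(observation, joint_action)
        return observation

    return lifted


def toy_policy(observation: Vector, waypoint: Vector) -> Vector:
    # Assumption for illustration: move half of the remaining distance.
    return [0.5 * (w - o) for o, w in zip(observation, waypoint)]


def toy_world_model(observation: Vector, joint_action: Vector) -> Vector:
    # Assumption for illustration: the action is an additive displacement.
    return [o + a for o, a in zip(observation, joint_action)]


lifted = lift_world_model(toy_policy, toy_world_model, horizon=3)
predicted = lifted([0.0, 0.0], [8.0, 4.0])
print(predicted)  # [7.0, 3.5]
```

A planner built on such a wrapper would search over waypoints only, which is a much smaller space than per-joint action sequences; this is the intuition the sketch is meant to convey, not a claim about how the paper's planner is implemented.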
