Poster in Workshop: World Models: Understanding, Modelling and Scaling
Recurrent world model with tokenized latent states
Guangyao Zhai · Xingyuan Zhang · Nassir Navab
Keywords: [ imitation learning ] [ memory ] [ attention ] [ world model ]
World models have gained increasing popularity in recent years. We introduce TokenWM, a new architecture that maintains the recurrent nature of state-space models while incorporating tokenized latent states and a memory-augmented attention mechanism to improve modeling capacity in complex environments. Preliminary results on the LIBERO benchmarks show that the new architecture is better suited to complex tasks than the popular RSSM architecture. We believe TokenWM introduces a new design paradigm for recurrent world models, enabling more expressive and scalable decision-making in complex environments.
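The abstract does not include code, but the core design it describes, a recurrent transition over a set of latent tokens with attention over a learned memory bank, can be sketched roughly as follows. This is a minimal illustrative sketch, not the authors' implementation: all names (TokenWMCell), shapes, and hyperparameters (16 tokens, 64 memory slots, action dimension 8) are assumptions.

import torch
import torch.nn as nn

class TokenWMCell(nn.Module):
    """Hypothetical sketch of one recurrent step over tokenized latent states.

    The latent state is a set of tokens rather than a single vector. Each step,
    the tokens attend over themselves, an encoded action token, and a bank of
    learned memory slots, then the result becomes the next latent state.
    """

    def __init__(self, num_tokens=16, dim=256, mem_slots=64, num_heads=8, action_dim=8):
        super().__init__()
        self.action_proj = nn.Linear(action_dim, dim)             # action as one extra token
        self.memory = nn.Parameter(torch.randn(mem_slots, dim))   # learned memory bank
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, tokens, action):
        # tokens: (B, num_tokens, dim) latent state; action: (B, action_dim)
        B = tokens.shape[0]
        act_tok = self.action_proj(action).unsqueeze(1)            # (B, 1, dim)
        mem = self.memory.unsqueeze(0).expand(B, -1, -1)           # (B, mem_slots, dim)
        context = torch.cat([tokens, act_tok, mem], dim=1)         # keys/values for attention
        out, _ = self.attn(self.norm(tokens), context, context)    # tokens read from memory too
        tokens = tokens + out                                      # residual update
        tokens = tokens + self.mlp(self.norm(tokens))
        return tokens                                              # next latent state tokens

An imagined rollout would apply this cell step by step, decoding observations from the token set at each step; the token count and memory size would trade modeling capacity against compute.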