Skip to yearly menu bar Skip to main content


Task Planning for Visual Room Rearrangement under Partial Observability

Karan Mirakhor · Sourav Ghosh · DIPANJAN DAS · Brojeshwar Bhowmick

Halle B #68
[ ]
Tue 7 May 1:45 a.m. PDT — 3:45 a.m. PDT


This paper presents a novel hierarchical task planner under partial observabilitythat empowers an embodied agent to use visual input to efficiently plan a sequenceof actions for simultaneous object search and rearrangement in an untidy room,to achieve a desired tidy state. The paper introduces (i) a novel Search Networkthat utilizes commonsense knowledge from large language models to find unseenobjects, (ii) a Deep RL network trained with proxy reward, along with (iii) a novelgraph-based state representation to produce a scalable and effective planner thatinterleaves object search and rearrangement to minimize the number of steps takenand overall traversal of the agent, as well as to resolve blocked goal and swapcases, and (iv) a sample-efficient cluster-biased sampling for simultaneous trainingof the proxy reward network along with the Deep RL network. Furthermore,the paper presents new metrics and a benchmark dataset - RoPOR, to measurethe effectiveness of rearrangement planning. Experimental results show that ourmethod significantly outperforms the state-of-the-art rearrangement methods Weihset al. (2021a); Gadre et al. (2022); Sarch et al. (2022); Ghosh et al. (2022).

Chat is not available.