In-Person Poster presentation / top 5% paper

Emergence of Maps in the Memories of Blind Navigation Agents

Erik Wijmans · Manolis Savva · Irfan Essa · Stefan Lee · Ari Morcos · Dhruv Batra

MH1-2-3-4 #106

Keywords: [ Reinforcement Learning ] [ Embodied AI ] [ navigation ] [ characterizing representations ]

award Outstanding Paper
[ ] [ Project Page ]

Abstract: Animal navigation research posits that organisms build and maintain internal spa- tial representations, or maps, of their environment. We ask if machines – specifically, artificial intelligence (AI) navigation agents – also build implicit (or ‘mental’) maps. A positive answer to this question would (a) explain the surprising phenomenon in recent literature of ostensibly map-free neural-networks achieving strong performance, and (b) strengthen the evidence of mapping as a fundamental mechanism for navigation by intelligent embodied agents, whether they be biological or artificial. Unlike animal navigation, we can judiciously design the agent’s perceptual system and control the learning paradigm to nullify alternative navigation mechanisms. Specifically, we train ‘blind’ agents – with sensing limited to only egomotion and no other sensing of any kind – to perform PointGoal navigation (‘go to $\Delta$x, $\Delta$y’) via reinforcement learning. Our agents are composed of navigation-agnostic components (fully-connected and recurrent neural networks), and our experimental setup provides no inductive bias towards mapping. Despite these harsh conditions, we find that blind agents are (1) surprisingly effective navigators in new environments (∼95% success); (2) they utilize memory over long horizons (remembering ∼1,000 steps of past experience in an episode); (3) this memory enables them to exhibit intelligent behavior (following walls, detecting collisions, taking shortcuts); (4) there is emergence of maps and collision detection neurons in the representations of the environment built by a blind agent as it navigates; and (5) the emergent maps are selective and task dependent (e.g. the agent ‘forgets’ exploratory detours). Overall, this paper presents no new techniques for the AI audience, but a surprising finding, an insight, and an explanation.

Chat is not available.