

In-Person Poster presentation / poster accept

Stateful Active Facilitator: Coordination and Environmental Heterogeneity in Cooperative Multi-Agent Reinforcement Learning

Dianbo Liu · Vedant Shah · Oussama Boussif · Cristian Meo · Anirudh Goyal · Tianmin Shu · Michael Mozer · Nicolas Heess · Yoshua Bengio

MH1-2-3-4 #97

Keywords: [ Reinforcement Learning ]


Abstract:

In cooperative multi-agent reinforcement learning, a team of agents works together to achieve a common goal. Different environments or tasks may require varying degrees of coordination among agents in order to achieve the goal in an optimal way. The nature of coordination will depend on properties of the environment: its spatial layout, distribution of obstacles, dynamics, etc. We term this variation of properties within an environment as heterogeneity. Existing literature has not sufficiently addressed the fact that different environments may have different levels of heterogeneity. We formalize the notions of coordination level and heterogeneity level of an environment and present HECOGrid, a suite of multi-agent RL environments that facilitates empirical evaluation of different MARL approaches across different levels of coordination and environmental heterogeneity by providing quantitative control over the coordination and heterogeneity levels of the environment. Further, we propose a Centralized Training Decentralized Execution learning approach called Stateful Active Facilitator (SAF) that enables agents to work efficiently in high-coordination and high-heterogeneity environments through a differentiable and shared knowledge source used during training and dynamic selection from a shared pool of policies. We evaluate SAF and compare its performance against the baselines IPPO and MAPPO on HECOGrid. Our results show that SAF consistently outperforms the baselines across different tasks and different heterogeneity and coordination levels.
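To make the two mechanisms named in the abstract concrete, the sketch below illustrates how an agent might read from a shared, differentiable knowledge source and dynamically select from a shared pool of policies. This is not the authors' released code: the module names (SharedKnowledge, SAFAgent), the discrete-action setting, and the soft attention-based selection are all illustrative assumptions.

```python
# Illustrative sketch only, NOT the paper's implementation. It shows:
# (1) a shared, learnable knowledge source read via attention, and
# (2) differentiable selection from a shared pool of policy heads.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SharedKnowledge(nn.Module):
    """A small set of learnable slots that all agents read via attention."""

    def __init__(self, num_slots: int, dim: int):
        super().__init__()
        self.slots = nn.Parameter(torch.randn(num_slots, dim))

    def read(self, query: torch.Tensor) -> torch.Tensor:
        # query: (batch, dim) -> attention-weighted summary of the shared slots
        attn = F.softmax(query @ self.slots.t(), dim=-1)   # (batch, num_slots)
        return attn @ self.slots                            # (batch, dim)


class SAFAgent(nn.Module):
    """One agent: encodes its observation, reads the shared knowledge,
    and softly selects a policy head from a pool shared by all agents."""

    def __init__(self, obs_dim: int, act_dim: int, dim: int, pool_size: int,
                 knowledge: SharedKnowledge, pool: nn.ModuleList):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, dim), nn.ReLU())
        self.selector = nn.Linear(2 * dim, pool_size)  # scores each pooled policy
        self.knowledge = knowledge                     # shared across agents
        self.pool = pool                               # shared across agents

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        h = self.encoder(obs)                          # (batch, dim)
        k = self.knowledge.read(h)                     # (batch, dim)
        ctx = torch.cat([h, k], dim=-1)                # (batch, 2*dim)
        # Soft (differentiable) selection over the shared policy pool.
        weights = F.softmax(self.selector(ctx), dim=-1)           # (batch, pool_size)
        logits = torch.stack([p(ctx) for p in self.pool], dim=1)  # (batch, pool_size, act_dim)
        return (weights.unsqueeze(-1) * logits).sum(dim=1)        # (batch, act_dim)


if __name__ == "__main__":
    dim, obs_dim, act_dim, pool_size, n_agents = 32, 16, 5, 4, 3
    knowledge = SharedKnowledge(num_slots=8, dim=dim)
    pool = nn.ModuleList(
        nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, act_dim))
        for _ in range(pool_size)
    )
    agents = [SAFAgent(obs_dim, act_dim, dim, pool_size, knowledge, pool)
              for _ in range(n_agents)]
    obs = torch.randn(n_agents, 2, obs_dim)            # batch of 2 per agent
    action_logits = [agent(obs[i]) for i, agent in enumerate(agents)]
    print(action_logits[0].shape)                      # torch.Size([2, 5])
```

In this sketch, the knowledge slots and the policy pool are the only parameters shared across agents, which is one plausible reading of the centralized-training, decentralized-execution setup described above; the actual SAF architecture and training objective are specified in the paper.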
