

Poster

Reconstruction-Guided Policy: Enhancing Decision-Making through Agent-Wise State Consistency

Qifan Liang · Yixiang Shan · Haipeng Liu · Zhengbang Zhu · Ting Long · Weinan Zhang · Yuan Tian

Hall 3 + Hall 2B #370
Sat 26 Apr midnight PDT — 2:30 a.m. PDT

Abstract:

An important challenge in multi-agent reinforcement learning is partial observability: during execution, agents cannot access the global state of the environment and receive only the observations within their field of view. To address this issue, previous works typically use a dimension-wise state, obtained by applying an MLP or dimension-wise attention to the global state, for decision-making during training, and rely on a reconstructed dimension-wise state during execution. However, dimension-wise states divert an agent's attention to specific features and neglect potential dependencies between agents, making it difficult to reach optimal decisions. Moreover, the inconsistency between the states used in training and execution introduces additional error. To resolve these issues, we propose Reconstruction-Guided Policy (RGP), which reconstructs an agent-wise state that captures inter-agent relationships and uses it as the input for decision-making during both training and execution. This not only preserves the potential dependencies between agents but also ensures consistency between the states used in training and execution. We conducted extensive experiments in both discrete- and continuous-action environments, and the results demonstrate the superior effectiveness of RGP. Our code is publicly available at https://anonymous.4open.science/r/RGP-9F79
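The abstract gives only a high-level description of RGP; the sketch below is a hypothetical illustration (not the authors' released code) of the core idea it describes: representing the global state agent-wise, with attention applied across the agent axis to capture inter-agent dependencies, and training a reconstruction module so the same agent-wise input is available from a local observation at execution time. All module names, layer sizes, and the reconstruction loss shown here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AgentWiseReconstructor(nn.Module):
    """Minimal sketch: agent-wise state encoding plus reconstruction from a
    single agent's partial observation, so the policy can consume the same
    agent-wise representation in both training and execution."""

    def __init__(self, obs_dim, state_dim, n_agents, embed_dim=64):
        super().__init__()
        self.n_agents, self.embed_dim = n_agents, embed_dim
        # Encode the global state agent-wise: one embedding per agent, with
        # attention over the agent axis rather than the feature axis.
        self.agent_embed = nn.Linear(state_dim, embed_dim)
        self.agent_attn = nn.MultiheadAttention(embed_dim, num_heads=4, batch_first=True)
        # Decoder mapping one agent's local observation to the full set of
        # per-agent embeddings, for use when the global state is unavailable.
        self.decoder = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, n_agents * embed_dim),
        )

    def encode(self, agent_states):
        # agent_states: (batch, n_agents, state_dim), per-agent slices of the global state
        h = self.agent_embed(agent_states)
        z, _ = self.agent_attn(h, h, h)  # attention across agents -> inter-agent dependencies
        return z                         # (batch, n_agents, embed_dim)

    def reconstruct(self, local_obs):
        # local_obs: (batch, obs_dim), a single agent's partial observation
        return self.decoder(local_obs).view(-1, self.n_agents, self.embed_dim)

# Train the decoder so the reconstruction matches the agent-wise encoding of the
# true global state; the policy then consumes the reconstructed agent-wise state
# in both phases, keeping training and execution consistent.
model = AgentWiseReconstructor(obs_dim=32, state_dim=16, n_agents=5)
states = torch.randn(8, 5, 16)   # global state split into per-agent components
obs = torch.randn(8, 32)         # one agent's local observation
loss = nn.functional.mse_loss(model.reconstruct(obs), model.encode(states).detach())
```

This contrasts with the dimension-wise approach the abstract critiques: here attention operates over the sequence of agents, so each agent's embedding can depend on the others, rather than over individual feature dimensions of a flattened state.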
