

Poster

Rational Decision-Making Agent with Learning Internal Utility Judgment

Yining Ye · Xin Cong · Shizuo Tian · Yujia Qin · Chong Liu · Yankai Lin · Zhiyuan Liu · Maosong Sun

Hall 3 + Hall 2B #230
[ Project Page ]
Fri 25 Apr midnight PDT — 2:30 a.m. PDT

Abstract:

With their remarkable advancements, large language models (LLMs) have attracted significant effort toward developing LLM-based agents capable of executing intricate multi-step decision-making tasks. Existing approaches predominantly rely on an external performance measure to guide the decision-making process, but this reliance is problematic in real-world scenarios, where such a prior may be unavailable, flawed, or even erroneous. For genuinely autonomous decision-making, LLM-based agents must develop rationality from their posterior experiences so they can judge the utility of each decision independently. In this work, we propose RaDAgent (Rational Decision-Making Agent), which develops its rationality through an iterative framework of Experience Exploration and Utility Learning. Within this framework, Elo-based Utility Learning assigns Elo scores to individual decision steps, judging their utilities via pairwise comparisons; these Elo scores then guide the decision-making process toward optimal outcomes. Experimental results on the Game of 24, WebShop, ToolBench, and RestBench datasets demonstrate RaDAgent’s superiority over baselines, with an average improvement of about 7.8%. RaDAgent also reduces cost (ChatGPT API calls), highlighting both its effectiveness and efficiency.
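To make the pairwise-comparison mechanism concrete, below is a minimal sketch of a standard Elo update applied to decision steps. This is an illustration under assumptions, not the paper's exact formulation: the K-factor, the initial score of 1000, and the use of an LLM judge to decide which decision "wins" a comparison are all assumed here.

```python
# Illustrative sketch (assumed details, not RaDAgent's exact formulation):
# a standard Elo update applied to two candidate decision steps after a
# pairwise comparison, e.g., one judged by an LLM.

def elo_update(score_a: float, score_b: float, a_wins: bool,
               k: float = 32.0) -> tuple[float, float]:
    """Update the Elo scores of two decisions after one pairwise comparison."""
    # Expected win probability of A under the Elo model.
    expected_a = 1.0 / (1.0 + 10 ** ((score_b - score_a) / 400.0))
    outcome_a = 1.0 if a_wins else 0.0
    # Move each score toward the observed outcome, scaled by K.
    new_a = score_a + k * (outcome_a - expected_a)
    new_b = score_b + k * ((1.0 - outcome_a) - (1.0 - expected_a))
    return new_a, new_b

# Example: two candidate decision steps start at the same (assumed) score;
# the judge prefers decision A, so A's score rises and B's falls.
score_a, score_b = 1000.0, 1000.0
score_a, score_b = elo_update(score_a, score_b, a_wins=True)

# The higher-scored decision is then favored when guiding the search.
best = "A" if score_a > score_b else "B"
```

Repeating such comparisons across explored experiences lets relative utilities emerge without any external performance measure, which is the role the abstract describes for Elo-based Utility Learning.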
