GraphPlanner: Graph Memory-Augmented Agentic Routing for Multi-Agent LLMs
Abstract
LLM routing has achieved promising results in integrating the strengths of diverse models while balancing efficiency and performance. However, to support more realistic and challenging applications, routing must extend into agentic LLM settings, where task planning, multi-round cooperation among heterogeneous agents, and memory utilization are indispensable. To address this gap, we propose GraphPlanner, a heterogeneous graph memory-augmented agentic router for multi-agent LLMs that generates routing workflows for each query and supports both inductive and transductive inference. GraphPlanner formulates workflow generation as a Markov Decision Process (MDP), where at each step it selects both the LLM backbone and the agent role (Planner, Executor, Summarizer). By leveraging a heterogeneous graph, denoted as GARNet, to capture interaction memories among queries, agents, and responses, GraphPlanner integrates historical memory and workflow memory into richer state representations. The entire pipeline is optimized with reinforcement learning, jointly improving task-specific performance and computational efficiency. We evaluate GraphPlanner across 14 diverse LLM tasks and demonstrate that: (1) GraphPlanner outperforms strong single- and multi-round routers, improving accuracy by up to 9.3% while reducing GPU cost from 186.26 GiB to 1.04 GiB; (2) GraphPlanner generalizes robustly to unseen tasks and LLMs, exhibiting strong zero-shot capabilities; and (3) GraphPlanner effectively leverages historical memories, supporting both inductive and transductive inference for more adaptive routing. Our code for GraphPlanner is released at https://github.com/ulab-uiuc/GraphPlanner.
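
To make the MDP formulation concrete, the following is a minimal illustrative sketch, not the released implementation. It assumes that an action is an (LLM backbone, agent role) pair, that the state holds the query plus the workflow built so far, and that an episode ends once a Summarizer is selected. The backbone names, the termination rule, and the reward form (performance minus weighted cost) are hypothetical choices made only for illustration; the graph memory (GARNet) is omitted.

```python
from dataclasses import dataclass, field
from itertools import product

# Agent roles named in the abstract.
ROLES = ("Planner", "Executor", "Summarizer")
# Hypothetical backbone names, used only for illustration.
LLMS = ("llm_small", "llm_large")


@dataclass(frozen=True)
class Action:
    llm: str
    role: str


@dataclass
class State:
    query: str
    # Actions selected so far in this workflow (a stand-in for workflow memory).
    workflow: list = field(default_factory=list)


def action_space():
    """Every (LLM backbone, agent role) pair is one candidate action."""
    return [Action(llm, role) for llm, role in product(LLMS, ROLES)]


def step(state, action):
    """Append the action to the workflow.

    Assumption: the episode terminates when a Summarizer is chosen.
    """
    next_state = State(state.query, state.workflow + [action])
    done = action.role == "Summarizer"
    return next_state, done


def reward(performance, cost, cost_weight=0.5):
    """Hypothetical scalar reward trading task performance against cost."""
    return performance - cost_weight * cost
```

With two backbones and three roles, `action_space()` yields six actions; a learned policy would choose among them at each step based on the current state.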