Learning What to Learn: Curriculum Curation for Test-Time Agent Learning
Qizheng Zhang ⋅ Sherry Ruan ⋅ Shubhangi Upasani ⋅ Fenglu Hong ⋅ Changxiu Ji ⋅ Changran Hu ⋅ Bo Li ⋅ Hanchen Li ⋅ Kunle Olukotun
Abstract
Test-time learning enables large language model (LLM) agents to adapt during inference without costly retraining, yet prior work largely treats all test-time experience as equally useful. We ask a simple question: *what data should agents learn from at test time?* Focusing on task selection and ordering for context-based adaptation, we hypothesize that redundant or overly simple examples offer diminishing returns, while curated curricula improve sample efficiency. Using the Agentic Context Engineering (ACE) framework, we evaluate on the AppWorld benchmark, which features tool-use and coding agents. We show that careful data selection can match full-dataset performance using only ~30% of training tasks, and that task ordering measurably affects learning outcomes. Our results position curriculum curation as a first-class design dimension for efficient test-time agent learning and practical deployment.