Poster Thu, Apr 23, 2026 • 11:15 AM – 1:45 PM PDT

MemAgent: Reshaping Long-Context LLM with Multi-Conv RL-based Memory Agent

Hongli Yu · Tinghong Chen · Jiangtao Feng · Jiangjie Chen · Weinan Dai · Qiying Yu · Ya-Qin Zhang · Wei-Ying Ma · Jingjing Liu · Mingxuan Wang · Hao Zhou

Abstract

Despite improvements from length extrapolation, efficient attention, and memory modules, handling infinitely long documents without performance degradation during extrapolation remains the ultimate challenge in long-text processing. To address this problem, we introduce MemAgent, a novel agent workflow that processes text in segments and updates its memory through an overwrite strategy, tackling long-context tasks through improved memory management. We further extend the DAPO algorithm to optimize memory ability directly in an end-to-end fashion, facilitating training via independent-context multi-conversation generation. Experimental results demonstrate that MemAgent has strong long-context capabilities: it extrapolates from an 8K context to a 3.5M-token QA task with a performance loss of less than 10% and achieves over 95% on the 512K NIAH test.
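
The segment-by-segment workflow with an overwritten memory can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: `split_into_chunks`, `run_memory_agent`, `toy_update`, and `toy_answer` are hypothetical, the two toy functions stand in for LLM calls, and the abstract does not specify the actual prompts, memory format, or chunk sizes. The sketch only shows the control flow: each step sees one segment plus a bounded memory, and the returned memory replaces the previous one.

```python
from typing import Callable, List

# Hypothetical signatures: (old_memory, chunk, question) -> new_memory,
# and (memory, question) -> answer. In a real system these would be LLM calls.
UpdateFn = Callable[[str, str, str], str]
AnswerFn = Callable[[str, str], str]


def split_into_chunks(text: str, chunk_words: int) -> List[str]:
    """Split text into consecutive segments of at most `chunk_words` words."""
    words = text.split()
    return [
        " ".join(words[i:i + chunk_words])
        for i in range(0, len(words), chunk_words)
    ]


def run_memory_agent(
    document: str,
    question: str,
    update_memory: UpdateFn,
    answer: AnswerFn,
    chunk_words: int = 128,
) -> str:
    """Read the document one segment at a time, overwriting a bounded memory."""
    memory = ""  # the only state carried from one segment to the next
    for chunk in split_into_chunks(document, chunk_words):
        # Overwrite strategy: the new memory replaces the old one entirely,
        # so the per-step input stays bounded regardless of document length.
        memory = update_memory(memory, chunk, question)
    return answer(memory, question)


def toy_update(memory: str, chunk: str, question: str) -> str:
    """Toy stand-in for the model: remember the first numeric token seen."""
    for token in chunk.split():
        cleaned = token.strip(".")
        if cleaned.isdigit():
            return cleaned  # overwrite memory with the relevant fact
    return memory  # nothing relevant in this segment; keep the old memory


def toy_answer(memory: str, question: str) -> str:
    """Toy stand-in for the final answering call: read the answer from memory."""
    return memory


if __name__ == "__main__":
    document = "filler " * 5000 + "the code is 4217 " + "filler " * 5000
    print(run_memory_agent(document, "What is the code?", toy_update, toy_answer))
```

Because each step consumes only one segment and the current memory, the cost per step is constant and the total cost grows linearly with document length in this sketch; whether the memory retains the right information depends entirely on the update function, which is the part the abstract says is trained with reinforcement learning.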