
Oral
in
Workshop: Self-Improving Foundation Models Without Human Supervision

A Self-Improving Coding Agent

Maxime Robeyns · Martin Szummer · Laurence Aitchison

Keywords: [ agents ] [ LLM ] [ coding ] [ self-improvement ]


Abstract:

Recent advancements in Large Language Models (LLMs) have spurred interest in deploying LLM agents to undertake tasks in the world. LLMs are often deployed in agent systems: code that orchestrates LLM calls and provides them with tools. We demonstrate that an agent system, equipped with basic coding tools, can autonomously edit itself and thereby improve its performance on benchmark tasks. We find performance gains from 17% to 53% on a random subset of SWE-bench Verified, with additional gains on LiveCodeBench as well as on synthetically generated agent benchmarks. Our work represents an advancement in the automated and open-ended design of agentic systems, and provides a reference agent framework for those seeking to post-train LLMs on tool use and other agentic tasks.
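The core loop the abstract describes — an agent that edits its own code and keeps changes only when they improve a benchmark score — can be sketched as follows. This is a minimal illustration, not the authors' implementation: `run_benchmark`, `propose_edit`, and the toy scoring are all hypothetical stand-ins (the real system would invoke an LLM and run actual benchmark tasks).

```python
def run_benchmark(agent_source: str) -> float:
    """Stub benchmark: in the real system this would run the agent on
    tasks (e.g. SWE-bench Verified) and return a pass rate. Here a toy
    proxy based on source length keeps the sketch self-contained."""
    return min(1.0, len(agent_source) / 100)


def propose_edit(agent_source: str) -> str:
    """Stub for an LLM call that rewrites the agent's own source code
    using its coding tools. Illustrative only."""
    return agent_source + "\n# refinement proposed by the LLM"


def self_improve(agent_source: str, iterations: int = 3) -> tuple[str, float]:
    """Hill-climbing self-improvement loop: benchmark, propose an edit
    to the agent's own code, keep the edit only if the score improves."""
    best_score = run_benchmark(agent_source)
    for _ in range(iterations):
        candidate = propose_edit(agent_source)
        score = run_benchmark(candidate)
        if score > best_score:  # reject edits that hurt performance
            agent_source, best_score = candidate, score
    return agent_source, best_score
```

The key design point this sketch captures is that the benchmark score acts as the sole selection signal, so no human supervision is needed to accept or reject a self-edit.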
