Workshop
Self-Improving Foundation Models Without Human Supervision
Amrith Setlur · Katie Kang · Aviral Kumar · Feryal Behbahani · Roberta Raileanu · Rishabh Agarwal
Garnet 214-215
Sat 26 Apr, 6 p.m. PDT
As foundation models (FMs) scale, they face a data bottleneck: the supply of high-quality internet data cannot keep pace with their training needs. This bottleneck is already most apparent for text data, has long been a problem in domains such as embodied intelligence, and is expected to soon affect other modalities as well. Self-improvement, a paradigm in which models train on synthetic data generated by themselves or by other models, offers a promising solution. This paradigm differs from both supervised learning, which relies on curated human data, and reinforcement learning (RL), which depends on external rewards. Self-improvement frameworks instead require models to curate their own training data, often using imperfect learned verifiers, which poses unique challenges. This workshop will explore algorithms for self-improvement, covering topics such as synthetic data, multi-agent and multi-modal systems, weak-to-strong generalization, inference-time self-supervision, and theoretical limits.
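To make the generate-curate-train cycle concrete, here is a minimal, hypothetical sketch of one self-improvement round: sample candidate responses from the current model, filter them with an imperfect learned verifier, and fine-tune on the surviving data. The function names (`generate`, `verify`, `finetune`) and the string-based "model" are illustrative placeholders under stated assumptions, not any specific method from the workshop.

```python
from dataclasses import dataclass
import random

@dataclass
class Example:
    prompt: str
    response: str
    score: float

def generate(model, prompt, n=4):
    # Placeholder sampler: in practice, draw n candidate responses from the FM.
    return [f"{model}-answer-{i}-to-{prompt}" for i in range(n)]

def verify(response):
    # Placeholder learned verifier: returns a noisy confidence score in [0, 1].
    # Real verifiers are imperfect, which is a central challenge of the paradigm.
    return random.random()

def finetune(model, data):
    # Placeholder training step: in practice, a supervised update on `data`.
    return model + f"+ft({len(data)})"

def self_improve_step(model, prompts, threshold=0.7):
    # 1. Generate synthetic data from the current model.
    candidates = [
        Example(p, r, verify(r)) for p in prompts for r in generate(model, p)
    ]
    # 2. Self-curate: keep only responses the (imperfect) verifier trusts.
    curated = [ex for ex in candidates if ex.score >= threshold]
    # 3. Train the model on its own curated outputs.
    return finetune(model, curated)

if __name__ == "__main__":
    model = "fm-v0"
    # Each round trains on data generated and filtered by the previous round's model.
    for _ in range(3):
        model = self_improve_step(model, ["q1", "q2"])
    print(model)
```

Real systems would replace the stubs with an FM sampler, a reward or verifier model, and a gradient-based update; the loop structure is meant only to mirror the cycle described above.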