Poster

Training Language Models to Self-Correct via Reinforcement Learning

Aviral Kumar · Vincent Zhuang · Rishabh Agarwal · Yi Su · JD Co-Reyes · Avi Singh · Kate Baumli · Shariq Iqbal · Colton Bishop · Rebecca Roelofs · Lei Zhang · Kay McKinney · Disha Shrivastava · Cosmin Paduraru · George Tucker · Doina Precup · Feryal Behbahani · Aleksandra Faust
2025 Poster

Abstract
