

Poster

Training Language Models to Self-Correct via Reinforcement Learning

Aviral Kumar ⋅ Vincent Zhuang ⋅ Rishabh Agarwal ⋅ Yi Su ⋅ JD Co-Reyes ⋅ Avi Singh ⋅ Kate Baumli ⋅ Shariq Iqbal ⋅ Colton Bishop ⋅ Rebecca Roelofs ⋅ Lei Zhang ⋅ Kay McKinney ⋅ Disha Shrivastava ⋅ Cosmin Paduraru ⋅ George Tucker ⋅ Doina Precup ⋅ Feryal Behbahani ⋅ Aleksandra Faust
2025 Poster

Abstract

