Poster Sat, Apr 25, 2026 • 6:30 AM – 9:00 AM PDT Pavilion 4 P4-#5202

R1-Code-Interpreter: LLMs Reason with Code via Supervised and Multi-stage Reinforcement Learning

Yongchao Chen ⋅ Yueying Liu ⋅ Junwei Zhou ⋅ Yilun Hao ⋅ Jingquan Wang ⋅ Yang Zhang ⋅ Na Li ⋅ Chuchu Fan

Abstract
