Poster Fri, Apr 24, 2026 • 11:15 AM – 1:45 PM PDT Pavilion 3 P3-#2014

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Bo Liu ⋅ Simon Yu ⋅ Zichen Liu ⋅ Leon Guertler ⋅ Penghui Qi ⋅ Daniel Balcells ⋅ Mickel Liu ⋅ Cheston Tan ⋅ Weiyan Shi ⋅ Min Lin ⋅ Wee Sun Lee ⋅ Natasha Jaques

Abstract
