

Poster Sat, Apr 25, 2026 • 6:30 AM – 9:00 AM PDT Pavilion 4 P4-#4601

From Verifiable Dot to Reward Chain: Harnessing Verifiable Reference-based Rewards for Reinforcement Learning of Open-ended Generation

Yuxin Jiang ⋅ Yufei Wang ⋅ Qiyuan Zhang ⋅ Xingshan Zeng ⋅ Liangyou Li ⋅ Jierun Chen ⋅ Chaofan Tao ⋅ Haoli Bai ⋅ Lifeng Shang

Abstract
