Poster Sat, Apr 25, 2026 • 6:30 AM – 9:00 AM PDT Pavilion 4 P4-#4609

RiskPO: Risk-based Policy Optimization with Verifiable Reward for LLM Post-Training

Tao Ren ⋅ Jinyang Jiang ⋅ Hui Yang ⋅ Wan Tian ⋅ Minhao Zou ⋅ Guanghao Li ⋅ Zishi Zhang ⋅ Qinghao Wang ⋅ Shentao Qin ⋅ Yanjun Zhao ⋅ Rui Tao ⋅ Hui Shao ⋅ Yijie Peng

Abstract
