Poster Thu, Apr 23, 2026 • 6:30 AM – 9:00 AM PDT Pavilion 4 P4-#4610

RM-R1: Reward Modeling as Reasoning

Xiusi Chen ⋅ Gaotang Li ⋅ Ziqi Wang ⋅ Bowen Jin ⋅ Cheng Qian ⋅ Yu Wang ⋅ Hongru WANG ⋅ Yu Zhang ⋅ Denghui Zhang ⋅ Tong Zhang ⋅ Hanghang Tong ⋅ Heng Ji

Abstract
