Poster Fri, Apr 24, 2026 • 11:15 AM – 1:45 PM PDT Pavilion 4 P4-#4607

FlowRL: Matching Reward Distributions for LLM Reasoning

Xuekai Zhu ⋅ Daixuan Cheng ⋅ Dinghuai Zhang ⋅ Henry Li ⋅ Kaiyan Zhang ⋅ Che Jiang ⋅ Youbang Sun ⋅ Ermo Hua ⋅ Yuxin Zuo ⋅ Xingtai Lv ⋅ Qizheng Zhang ⋅ Lin Chen ⋅ Fanghao Shao ⋅ Bo Xue ⋅ Yunchong Song ⋅ Zhenjie Yang ⋅ Ganqu Cui ⋅ Ning Ding ⋅ Jianfeng Gao ⋅ Xiaodong Liu ⋅ Bowen Zhou ⋅ Hongyuan Mei ⋅ Zhouhan Lin

Abstract
