

Poster

FlowRL: Matching Reward Distributions for LLM Reasoning

Xuekai Zhu · Daixuan Cheng · Dinghuai Zhang · Henry Li · Kaiyan Zhang · Che Jiang · Youbang Sun · Ermo Hua · Yuxin Zuo · Xingtai Lv · Qizheng Zhang · Lin Chen · Fanghao Shao · Bo Xue · Yunchong Song · Zhenjie Yang · Ganqu Cui · Ning Ding · Jianfeng Gao · Xiaodong Liu · Bowen Zhou · Hongyuan Mei · Zhouhan Lin

Abstract
