Poster

Attention as a Compass: Efficient Exploration for Process-Supervised RL in Reasoning Models

Runze Liu · Jiakang Wang · Yuling Shi · Zhihui Xie · Chenxin An · Kaiyan Zhang · Jian Zhao · Xiaodong Gu · Lei Lin · Wenping Hu · Xiu Li · Fuzheng Zhang · Guorui Zhou · Kun Gai

Abstract
