Poster

Unifying Stable Optimization and Reference Regularization in RLHF

Li He · Qiang Qu · He Zhao · Stephen Wan · Dadong Wang · Lina Yao · Tongliang Liu

Abstract
