Poster

Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF

Shicong Cen ⋅ Jincheng Mei ⋅ Katayoon Goshvadi ⋅ Hanjun Dai ⋅ Tong Yang ⋅ Sherry Yang ⋅ Dale Schuurmans ⋅ Yuejie Chi ⋅ Bo Dai
2025 Poster
