

Poster

Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF

Tengyang Xie · Dylan Foster · Akshay Krishnamurthy · Corby Rosset · Ahmed H Awadallah · Alexander Rakhlin
2025 Poster

Abstract

