

Spotlight Poster

Confronting Reward Model Overoptimization with Constrained RLHF

Ted Moskovitz ⋅ Aaditya Singh ⋅ DJ Strouse ⋅ Tuomas Sandholm ⋅ Ruslan Salakhutdinov ⋅ Anca Dragan ⋅ Stephen McAleer
2024 Spotlight Poster

Abstract

