

Poster

Reward Model Ensembles Help Mitigate Overoptimization

Thomas Coste ⋅ Usman Anwar ⋅ Robert Kirk ⋅ David Krueger
2024 Poster
