

RMBoost: Reward Model Training With Preference-Conditional Multi-Aspect Synthetic Data Generation

Jiaming Shen ⋅ Ran Xu ⋅ Yennie Jun ⋅ Zhen Qin ⋅ Tianqi Liu ⋅ Carl Yang ⋅ Yi Liang ⋅ Simon Baumgartner ⋅ Michael Bendersky

Abstract
