AMPO: Active Multi Preference Optimization for Self-play Preference Selection

Taneesh Gupta ⋅ Rahul Madhavan ⋅ Xuchao Zhang ⋅ Chetan Bansal ⋅ Saravanakumar Rajmohan

Abstract
