Poster

Weak-to-Strong Preference Optimization: Stealing Reward from Weak Aligned Model

Wenhong Zhu ⋅ Zhiwei He ⋅ Xiaofeng Wang ⋅ Pengfei Liu ⋅ Rui Wang
2025 Poster

Abstract
