Reveal the Mystery of DPO: The Connection between DPO and RL Algorithms

Xuerui Su ⋅ Yue Wang ⋅ Jinhua Zhu ⋅ Mingyang Yi ⋅ Feng Xu ⋅ Zhi-Ming Ma ⋅ Yuting Liu

Abstract
