Spotlight Poster

Safe RLHF: Safe Reinforcement Learning from Human Feedback

Juntao Dai · Xuehai Pan · Ruiyang Sun · Jiaming Ji · Xinbo Xu · Mickel Liu · Yizhou Wang · Yaodong Yang
2024 Spotlight Poster

Abstract
