Search All 2024 Events
 

13 Results

Page 1 of 2
Poster
Thu 1:45 Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback
Yifu Yuan · Jianye HAO · Yi Ma · Zibin Dong · Hebin Liang · Jinyi Liu · Zhixin Feng · Kai Zhao · YAN ZHENG
Workshop
Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint
Wei Xiong · Hanze Dong · Chenlu Ye · Ziqi Wang · Han Zhong · Heng Ji · Nan Jiang · Tong Zhang
Affinity Workshop
Thu 7:30 RLHF without RL
Mischa Panchenko
Affinity Workshop
Wed 7:30 The N Implementation Details of RLHF with PPO
Shengyi Huang · Tianlin Liu · Leandro Von Werra
Affinity Workshop
Thu 7:30 Policy Optimization in RLHF: The Impact of Out-of-preference Data
Ziniu Li · Tian Xu · Yang Yu
Poster
Wed 1:45 The Trickle-down Impact of Reward Inconsistency on RLHF
Lingfeng Shen · Sihao Chen · Linfeng Song · Lifeng Jin · Baolin Peng · Haitao Mi · Daniel Khashabi · Dong Yu
Affinity Workshop
Policy Optimization in RLHF: The Impact of Out-of-preference Data
Ziniu Li · Tian Xu · Yang Yu
Poster
Tue 7:30 Confronting Reward Model Overoptimization with Constrained RLHF
Ted Moskovitz · Aaditya Singh · DJ Strouse · Tuomas Sandholm · Ruslan Salakhutdinov · Anca Dragan · Stephen McAleer
Poster
Wed 1:45 Safe RLHF: Safe Reinforcement Learning from Human Feedback
Juntao Dai · Xuehai Pan · Ruiyang Sun · Jiaming Ji · Xinbo Xu · Mickel Liu · Yizhou Wang · Yaodong Yang
Poster
Wed 1:45 Understanding the Effects of RLHF on LLM Generalisation and Diversity
Robert Kirk · Ishita Mediratta · Christoforos Nalmpantis · Jelena Luketina · Eric Hambro · Edward Grefenstette · Roberta Raileanu
Workshop
Peering Through Preferences: Unraveling Feedback Acquisition for Aligning Large Language Models
Hritik Bansal · John Dang · Aditya Grover