Search All 2024 Events
 

266 Results

Poster
Wed 1:45 Universal Jailbreak Backdoors from Poisoned Human Feedback
Javier Rando · Florian Tramer
Poster
Wed 1:45 Human Feedback is not Gold Standard
Tom Hosking · Phil Blunsom · Max Bartolo
Poster
Thu 1:45 Differentially Private SGD Without Clipping Bias: An Error-Feedback Approach
Xinwei Zhang · Zhiqi Bu · Steven Wu · Mingyi Hong
Poster
Wed 7:30 Contrastive Preference Learning: Learning from Human Feedback without Reinforcement Learning
Joey Hejna · Rafael Rafailov · Harshit Sikchi · Chelsea Finn · Scott Niekum · W. Bradley Knox · Dorsa Sadigh
Poster
Wed 1:45 Safe RLHF: Safe Reinforcement Learning from Human Feedback
Juntao Dai · Xuehai Pan · Ruiyang Sun · Jiaming Ji · Xinbo Xu · Mickel Liu · Yizhou Wang · Yaodong Yang
Poster
Wed 7:30 Zeroth-Order Optimization Meets Human Feedback: Provable Learning via Ranking Oracles
Zhiwei Tang · Dmitry Rybin · Tsung-Hui Chang
Poster
Thu 1:45 Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback
Yifu Yuan · Jianye HAO · Yi Ma · Zibin Dong · Hebin Liang · Jinyi Liu · Zhixin Feng · Kai Zhao · YAN ZHENG
Poster
Thu 7:30 PARL: A Unified Framework for Policy Alignment in Reinforcement Learning from Human Feedback
Souradip Chakraborty · Amrit Bedi · Alec Koppel · Huazheng Wang · Dinesh Manocha · Mengdi Wang · Furong Huang
Poster
Wed 1:45 Hindsight PRIORs for Reward Learning from Human Preferences
Mudit Verma · Katherine Metcalf
Poster
Thu 7:30 The Human-AI Substitution game: active learning from a strategic labeler
Tom Yan · Chicheng Zhang
Workshop
Learning to Abstract Visuomotor Mappings using Meta-Reinforcement Learning
Carlos Velazquez-Vargas · Isaac Christian · Jordan Taylor · Sreejan Kumar
Poster
Tue 7:30 Making RL with Preference-based Feedback Efficient via Randomization
Runzhe Wu · Wen Sun