22 Results
Poster
|
Wed 7:30 |
ARGS: Alignment as Reward-Guided Search Maxim Khanov · Jirayu Burapacheep · Yixuan Li |
|
Affinity Workshop
|
Fri 1:45 |
Reward Bound for Behavioral Guarantee of Model-based Planning Agents Zhiyu An · Xianzhong Ding · Wan Du |
|
Workshop
|
Bayesian reward models for LLM alignment Adam Yang · Maxime Robeyns · Thomas Coste · Jun Wang · Haitham Bou Ammar · Laurence Aitchison |
||
Poster
|
Wed 7:30 |
Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models Shuai Zhao · Xiaohan Wang · Linchao Zhu · Yi Yang |
|
Poster
|
Thu 7:30 |
SemiReward: A General Reward Model for Semi-supervised Learning Siyuan Li · Weiyang Jin · Zedong Wang · Fang Wu · Zicheng Liu · Cheng Tan · Stan Z Li |
|
Poster
|
Tue 1:45 |
Reward Model Ensembles Help Mitigate Overoptimization Thomas Coste · Usman Anwar · Robert Kirk · David Krueger |
|
Poster
|
Wed 1:45 |
Tool-Augmented Reward Modeling Lei Li · Yekun Chai · Shuohuan Wang · Yu Sun · Hao Tian · Ningyu Zhang · Hua Wu |
|
Poster
|
Wed 1:45 |
The Trickle-down Impact of Reward Inconsistency on RLHF Lingfeng Shen · Sihao Chen · Linfeng Song · Lifeng Jin · Baolin Peng · Haitao Mi · Daniel Khashabi · Dong Yu |
|
Poster
|
Thu 1:45 |
DreamSmooth: Improving Model-based Reinforcement Learning via Reward Smoothing Vint Lee · Pieter Abbeel · Youngwoon Lee |
|
Poster
|
Tue 1:45 |
Directly Fine-Tuning Diffusion Models on Differentiable Rewards Kevin Clark · Paul Vicol · Kevin Swersky · David Fleet |
|
Poster
|
Thu 1:45 |
SALMON: Self-Alignment with Instructable Reward Models Zhiqing Sun · Yikang Shen · Hongxin Zhang · Qinhong Zhou · Zhenfang Chen · David Cox · Yiming Yang · Chuang Gan |