

Poster

Mitigating Reward Over-Optimization in RLHF via Behavior-Supported Regularization

Juntao Dai · Taiye Chen · Yaodong Yang · Qian Zheng · Gang Pan
2025 Poster

