Poster

When Greedy Wins: Emergent Exploitation Bias in Meta-Bandit LLM Training

Sanxing Chen · Xiaoyin Chen · Yukun Huang · Roy Xie · Bhuwan Dhingra

Abstract
