

Poster Session #1 in Workshop: Workshop on Scaling Post-training for LLMs (SPOT)
Mon, Apr 27, 2026 • 7:35 AM – 8:20 AM PDT

Near-Optimal Regret for KL-Regularized Multi-Armed Bandits

Kaixuan Ji ⋅ Qingyue Zhao ⋅ Heyang Zhao ⋅ Qiwei Di ⋅ Quanquan Gu

Abstract
