Poster in Workshop: Lifelong Agents: Learning, Aligning, Evolving
Sun, Apr 26, 2026 • 6:00 AM – 7:00 AM PDT

CoDaPO: Confidence and Difficulty-Adaptive Policy Optimization for LLM Reasoning

(Andrew) Zhanke Zhou ⋅ Xiangyu Lu ⋅ Chentao Cao ⋅ Brando Miranda ⋅ Tongliang Liu ⋅ Bo Han ⋅ Sanmi Koyejo

Abstract
