Poster in Workshop: Catch, Adapt, and Operate: Monitoring ML Models Under Drift
Sun, Apr 26, 2026 • 10:45 AM – 11:30 AM PDT

Hindsight-Anchored Policy Optimization: Turning Failure into Feedback in Sparse Reward Settings

Yuning Wu ⋅ Ke Wang ⋅ Devin Chen ⋅ Kai Wei

Abstract
