Poster Thu, Apr 23, 2026 • 6:30 AM – 9:00 AM PDT

Noisy but Valid: Robust Statistical Evaluation of LLMs with Imperfect Judges

Chen Feng · Minghe Shen · Ananth Balashankar · Carsten Gerner-Beuerle · Miguel Rodrigues

Abstract
