Poster

Noisy but Valid: Robust Statistical Evaluation of LLMs with Imperfect Judges

Chen Feng · Minghe Shen · Ananth Balashankar · Carsten Gerner-Beuerle · Miguel Rodrigues

Abstract
