Invited Talk

The emerging science of benchmarks

Moritz Hardt

Moderator: Mohammad Emtiyaz Khan

Halle A 8 - 9
Thu 9 May 11:30 p.m. PDT — 12:30 a.m. PDT

Abstract:

Benchmarks are the keystone that holds the machine learning community together. Benchmarking has grown as a research paradigm since the 1980s; there's much we've done with benchmarks, but little we know about them. In this talk, I will trace the rudiments of an emerging science of benchmarks through selected empirical and theoretical observations. Specifically, we'll discuss the role of annotator errors, the external validity of model rankings, and the promise of multi-task benchmarks. The results in each case challenge conventional wisdom and underscore the benefits of developing a science of benchmarks.
