Poster Fri, Apr 24, 2026 • 6:30 AM – 9:00 AM PDT Pavilion 3 P3-#1614

Common Corpus: The Largest Collection of Ethical Data for LLM Pre-Training

Pierre-Carl Langlais ⋅ Pavel Chizhov ⋅ Catherine Arnett ⋅ Carlos Hinostroza ⋅ Mattia Nee ⋅ Eliot Jones ⋅ Irène Girard ⋅ David Mach ⋅ Anastasia Stasenko ⋅ Ivan Yamshchikov

Abstract
