Poster

GneissWeb: Preparing High Quality Data for LLMs at Scale

Hajar Emami Gohari · Swanand Kadhe · Yousaf Shah · Constantin Adam · Abdulhamid Adebayo · Praneet Adusumilli · Farhan Ahmed · Nathalie Baracaldo · Santosh Borse · Yuan-Chi Chang · Xuan-Hong Dang · Nirmit Desai · Revital Eres · Ran Iwamoto · Alexei Karve · Yan Koyfman · Wei-Han Lee · Changchang Liu · Boris Lublinsky · Takuya Ohko · Pablo Pesce · Maroun Touma · Shiqiang Wang · Shalisha Witherspoon · Herbert Woisetschläger · David Wood · Kun-Lung Wu · Issei Yoshida · Syed Zawad · Petros Zerfos · Yi Zhou · Bishwaranjan Bhattacharjee

Abstract
