

Poster

Small-scale proxies for large-scale Transformer training instabilities

Mitchell Wortsman ⋅ Peter Liu ⋅ Lechao Xiao ⋅ Katie Everett ⋅ Alexander Alemi ⋅ Ben Adlam ⋅ John Co-Reyes ⋅ Izzeddin Gur ⋅ Abhishek Kumar ⋅ Roman Novak ⋅ Jeffrey Pennington ⋅ Jascha Sohl-Dickstein ⋅ Kelvin Xu ⋅ Jaehoon Lee ⋅ Justin Gilmer ⋅ Simon Kornblith
2024 Poster

Abstract
