Poster

JoMA: Demystifying Multilayer Transformers via Joint Dynamics of MLP and Attention

Yuandong Tian ⋅ Yiping Wang ⋅ Zhenyu Zhang ⋅ Beidi Chen ⋅ Simon Du
2024 Poster

