Poster

JoMA: Demystifying Multilayer Transformers via Joint Dynamics of MLP and Attention

Yuandong Tian · Yiping Wang · Zhenyu Zhang · Beidi Chen · Simon Du
2024 Poster
