

Linear attention is (maybe) all you need (to understand Transformer optimization)

Kwangjun Ahn ⋅ Xiang Cheng ⋅ Minhak Song ⋅ Chulhee Yun ⋅ Ali Jadbabaie ⋅ Suvrit Sra
2024 Poster
