In-Person Poster presentation / top 25% paper

Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers

Tianlong Chen ⋅ Zhenyu Zhang ⋅ Ajay Jaiswal ⋅ Shiwei Liu ⋅ Zhangyang Wang
2023
