Poster

On the Optimization and Generalization of Multi-head Attention

Christos Thrampoulidis ⋅ Rouzbeh Ghaderi ⋅ Hossein Taheri ⋅ Puneesh Deora
2025 Poster

Abstract

