Poster

On the Optimization and Generalization of Two-layer Transformers with Sign Gradient Descent

Bingrui Li ⋅ Wei Huang ⋅ Andi Han ⋅ Zhanpeng Zhou ⋅ Taiji Suzuki ⋅ Jun Zhu ⋅ Jianfei Chen
2025 Poster

Abstract

