Oral
Fri Apr 24 06:30 AM -- 06:40 AM (PDT)
The Polar Express: Optimal Matrix Sign Methods and their Application to the Muon Algorithm
[OpenReview]
Oral
Fri Apr 24 06:42 AM -- 06:52 AM (PDT)
Temporal superposition and feature geometry of RNNs under memory demands
[OpenReview]
Oral
Fri Apr 24 06:54 AM -- 07:04 AM (PDT)
Scaling Laws and Spectra of Shallow Neural Networks in the Feature Learning Regime
[OpenReview]
Oral
Fri Apr 24 07:06 AM -- 07:16 AM (PDT)
Efficient Resource-Constrained Training of Vision Transformers via Subspace Optimization
[OpenReview]
Oral
Fri Apr 24 07:18 AM -- 07:28 AM (PDT)
Why Low-Precision Transformer Training Fails: An Analysis on Flash Attention
[OpenReview]
Oral
Fri Apr 24 07:30 AM -- 07:40 AM (PDT)
HATSolver: Learning Gröbner Bases with Hierarchical Attention Transformers
[OpenReview]