

Search All 2023 Events
 

9 Results

Poster
Wed 7:30 A view of mini-batch SGD via generating functions: conditions of convergence, phase transitions, benefit from negative momenta.
Maksim Velikanov · Denis Kuznedelev · Dmitry Yarotsky
Poster
Mon 2:30 A new characterization of the edge of stability based on a sharpness measure aware of batch gradient distribution
Sungyoon Lee · Cheongjae Jang
Poster
Neural Networks Efficiently Learn Low-Dimensional Representations with SGD
Alireza Mousavi-Hosseini · Sejun Park · Manuela Girotti · Ioannis Mitliagkas · Murat A Erdogdu
Poster
Disentangling the Mechanisms Behind Implicit Regularization in SGD
Zachary Novack · Simran Kaur · Tanya Marwah · Saurabh Garg · Zachary Lipton
Poster
Mon 2:30 Why (and When) does Local SGD Generalize Better than SGD?
Xinran Gu · Kaifeng Lyu · Longbo Huang · Sanjeev Arora
Workshop
Thu 4:00 Looped Transformers as Programmable Computers
Angeliki Giannou · Shashank Rajput · Jy-yong Sohn · Kangwook Lee · Jason Lee · Dimitris Papailiopoulos
Poster
Tue 2:30 Noise Is Not the Main Factor Behind the Gap Between SGD and Adam on Transformers, But Sign Descent Might Be
Frederik Kunstner · Jacques Chen · Jonathan Lavington · Mark Schmidt