9 Results
Poster | | Exploring the Limits of Differentially Private Deep Learning with Group-wise Clipping | Jiyan He · Xuechen Li · Da Yu · Huishuai Zhang · Janardhan Kulkarni · Yin Tat Lee · Arturs Backurs · Nenghai Yu · Jiang Bian
Poster | Wed 7:30 | A view of mini-batch SGD via generating functions: conditions of convergence, phase transitions, benefit from negative momenta | Maksim Velikanov · Denis Kuznedelev · Dmitry Yarotsky
Poster | Mon 2:30 | A new characterization of the edge of stability based on a sharpness measure aware of batch gradient distribution | Sungyoon Lee · Cheongjae Jang
Poster | | Neural Networks Efficiently Learn Low-Dimensional Representations with SGD | Alireza Mousavi-Hosseini · Sejun Park · Manuela Girotti · Ioannis Mitliagkas · Murat A Erdogdu
Poster | | Disentangling the Mechanisms Behind Implicit Regularization in SGD | Zachary Novack · Simran Kaur · Tanya Marwah · Saurabh Garg · Zachary Lipton
Poster | Mon 2:30 | Why (and When) does Local SGD Generalize Better than SGD? | Xinran Gu · Kaifeng Lyu · Longbo Huang · Sanjeev Arora
Poster | | Improved Convergence of Differential Private SGD with Gradient Clipping | Huang Fang · Xiaoyun Li · Chenglin Fan · Ping Li
Workshop | Thu 4:00 | Looped Transformers as Programmable Computers | Angeliki Giannou · Shashank Rajput · Jy-yong Sohn · Kangwook Lee · Jason Lee · Dimitris Papailiopoulos
Poster | Tue 2:30 | Noise Is Not the Main Factor Behind the Gap Between SGD and Adam on Transformers, But Sign Descent Might Be | Frederik Kunstner · Jacques Chen · Jonathan Lavington · Mark Schmidt