Poster | Mon 17:00 | Proximal Gradient Descent-Ascent: Variable Convergence under KŁ Geometry | Ziyi Chen · Yi Zhou · Tengyu Xu · Yingbin Liang
Poster | Wed 17:00 | Local Convergence Analysis of Gradient Descent Ascent with Finite Timescale Separation | Tanner Fiez · Lillian J Ratliff
Oral | Thu 0:30 | Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime | Atsushi Nitanda · Taiji Suzuki
Spotlight | Wed 5:15 | Benefit of deep learning with non-convex noisy gradient descent: Provable excess risk bound and superiority to kernel methods | Taiji Suzuki · Akiyama Shunta
Poster | Wed 1:00 | Byzantine-Resilient Non-Convex Stochastic Gradient Descent | Zeyuan Allen-Zhu · Faeze Ebrahimianghazani · Jerry Li · Dan Alistarh
Poster | Thu 9:00 | Linear Last-iterate Convergence in Constrained Saddle-point Optimization | Chen-Yu Wei · Chung-Wei Lee · Mengxiao Zhang · Haipeng Luo
Poster | Tue 9:00 | On the Origin of Implicit Regularization in Stochastic Gradient Descent | Samuel Smith · Benoit Dherin · David Barrett · Soham De
Poster | Mon 9:00 | Multi-Level Local SGD: Distributed SGD for Heterogeneous Hierarchical Networks | Timothy Castiglia · Anirban Das · Stacy Patterson
Poster | Mon 9:00 | On the Impossibility of Global Convergence in Multi-Loss Optimization | Alistair Letcher
Poster | Mon 17:00 | When does preconditioning help or hurt generalization? | Shun-ichi Amari · Jimmy Ba · Roger Grosse · Xuechen Li · Atsushi Nitanda · Taiji Suzuki · Denny Wu · Ji Xu
Spotlight | Wed 20:20 | Understanding the role of importance weighting for deep learning | Da Xu · Yuting Ye · Chuanwei Ruan
Poster | Mon 1:00 | On the Universality of the Double Descent Peak in Ridgeless Regression | David Holzmüller