A Class of Short-term Recurrence Anderson Mixing Methods and Their Applications

Fuchao Wei · Chenglong Bao · Yang Liu

Keywords: [ stochastic optimization ] [ nonconvex optimization ]

[ Abstract ]
[ Visit Poster at Spot F0 in Virtual World ] [ OpenReview
Tue 26 Apr 6:30 p.m. PDT — 8:30 p.m. PDT


Anderson mixing (AM) is a powerful acceleration method for fixed-point iterations, but its computation requires storing many historical iterations. The extra memory footprint can be prohibitive when solving high-dimensional problems in a resource-limited machine. To reduce the memory overhead, we propose a novel class of short-term recurrence AM methods (ST-AM). The ST-AM methods only store two previous iterations with cheap corrections. We prove that the basic version of ST-AM is equivalent to the full-memory AM in strongly convex quadratic optimization, and with minor changes it has local linear convergence for solving general nonlinear fixed-point problems. We further analyze the convergence properties of the regularized ST-AM for nonconvex (stochastic) optimization. Finally, we apply ST-AM to several applications including solving root-finding problems and training neural networks. Experimental results show that ST-AM is competitive with the long-memory AM and outperforms many existing optimizers.

Chat is not available.