Regularized Nonlinear Acceleration (RNA) can improve the convergence rate of many optimization schemes, such as gradient descent, SAGA, or SVRG, by estimating the optimum through a nonlinear average of past iterates. Until now, its analysis was limited to convex problems, but empirical observations show that RNA may extend to a broader setting. Here, we investigate the benefits of nonlinear acceleration when applied to the training of neural networks, in particular for the task of image recognition on the CIFAR-10 and ImageNet data sets. In our experiments, with minimal modifications to existing frameworks, RNA speeds up convergence and improves test error on standard CNNs.
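To make the "nonlinear average of past iterates" concrete, the sketch below follows the standard regularized nonlinear acceleration recipe: form residuals from consecutive iterates, solve a small regularized least-squares system for combination weights that sum to one, and return the weighted average. Function and parameter names (rna_extrapolate, lambda_reg) and the regularization value are illustrative assumptions, not the paper's implementation.

import numpy as np

def rna_extrapolate(iterates, lambda_reg=1e-8):
    # iterates: list of 1-D parameter vectors x_0, ..., x_k, oldest first.
    X = np.stack(iterates, axis=1)          # shape (d, k+1)
    R = X[:, 1:] - X[:, :-1]                # residuals r_i = x_{i+1} - x_i, shape (d, k)
    RTR = R.T @ R
    RTR = RTR / np.linalg.norm(RTR)         # normalize so the regularizer is scale-free
    k = RTR.shape[0]
    # Solve (R^T R + lambda * I) z = 1, then rescale so the weights sum to one.
    z = np.linalg.solve(RTR + lambda_reg * np.eye(k), np.ones(k))
    c = z / z.sum()
    # Nonlinear average of the stored iterates with the computed weights.
    return X[:, :-1] @ c

In practice one would keep a short sliding window of parameter snapshots (for instance, flattened network weights taken once per epoch) and periodically evaluate the extrapolated point alongside the current iterate; the details of how this is interleaved with SGD training are a design choice, not something prescribed by the abstract above.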