Poster
BP-Modified Local Loss for Efficient Training of Deep Neural Networks
REN Lianhai · Qianxiao Li
Hall 3 + Hall 2B #122
The training of large models is memory-constrained; one direction for relieving this is training with local losses, as in GIM, LoCo, and the Forward-Forward algorithm. However, local loss methods often suffer from slow convergence or non-convergence. In this paper, we propose a novel BP-modified local loss method that uses the true backpropagation (BP) gradient to modify the local loss gradient and thereby improve the performance of local loss training. We analyze our method using the stochastic modified equation and show that the modification offset decreases the bias between the BP gradient and the local loss gradient but introduces additional variance, resulting in a bias-variance trade-off. Numerical experiments with full fine-tuning and LoKr tuning of the ResNet-50 model and LoRA tuning of the ViT-b16 model on the CIFAR-100 dataset show a 20.5% test top-1 accuracy improvement over the Forward-Forward algorithm, an 18.6% improvement over the LoCo algorithm, and an average test-accuracy loss of only 7.7% relative to BP, with up to 75% memory savings.
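The abstract describes correcting a local-loss gradient with the true BP gradient. Below is a minimal illustrative sketch of one way such a correction could look; the combination rule `local_grad + alpha * (bp_grad - local_grad)` and the mixing coefficient `alpha` are assumptions for illustration, not the paper's actual formulation.

```python
import numpy as np

def bp_modified_gradient(local_grad, bp_grad, alpha=0.1):
    """Combine a local-loss gradient with the true BP gradient.

    The offset ``alpha * (bp_grad - local_grad)`` pulls the local
    gradient toward the BP gradient, trading reduced bias for the
    additional variance of the BP estimate. ``alpha`` is a
    hypothetical mixing coefficient, not a value from the paper.
    """
    offset = alpha * (bp_grad - local_grad)
    return local_grad + offset

# Toy usage: both gradients are noisy estimates of a shared true direction.
rng = np.random.default_rng(0)
true_grad = rng.normal(size=10)
local_grad = true_grad + rng.normal(scale=0.5, size=10) + 0.3  # biased local estimate
bp_grad = true_grad + rng.normal(scale=0.2, size=10)           # unbiased BP estimate

modified = bp_modified_gradient(local_grad, bp_grad, alpha=0.5)
print("local-grad error:   ", np.linalg.norm(local_grad - true_grad))
print("modified-grad error:", np.linalg.norm(modified - true_grad))
```

In this toy setting the modified gradient typically lands closer to the true direction than the biased local estimate, illustrating the bias-variance balance the abstract refers to.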