Learning by Directional Gradient Descent

David Silver · Anirudh Goyal · Ivo Danihelka · Matteo Hessel · Hado van Hasselt

Keywords: credit assignment, recurrent networks

Mon 25 Apr 10:30 a.m. PDT — 12:30 p.m. PDT


How should state be constructed from a sequence of observations so as to best achieve some objective? Most deep learning methods update the parameters of the state representation by gradient descent. However, no prior method for computing the gradient is fully satisfactory: existing methods may, for example, consume too much memory, introduce too much variance, or add too much bias. In this work, we propose a new learning algorithm that addresses these limitations. The basic idea is to update the parameters of the representation using the directional derivative along a candidate direction, a quantity that may be computed online with the same computational cost as the representation itself. We consider several choices of candidate direction, including random selection and approximations to the true gradient, and investigate their performance on several synthetic tasks.
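
The basic idea can be illustrated with a minimal sketch. This is not the authors' implementation: the scalar recurrent network, the squared-error loss, and the names `loss_and_directional_derivative` and `directional_step` are assumptions chosen for illustration. The sketch carries the tangent of the state along a candidate direction through the same forward pass that computes the state, so no history of past states needs to be stored.

```python
import math


def loss_and_directional_derivative(params, direction, xs, target):
    """Run a tiny scalar RNN forward, carrying the tangent of the state.

    Hypothetical model: h_t = tanh(w * h_{t-1} + u * x_t), with loss
    0.5 * (h_T - target)^2. Returns the loss and its directional
    derivative along `direction`, both from a single forward pass.
    """
    w, u = params
    dw, du = direction
    h, dh = 0.0, 0.0  # state and its tangent along `direction`
    for x in xs:
        pre = w * h + u * x
        # Product rule on the pre-activation, using the previous h and dh.
        dpre = dw * h + w * dh + du * x
        h = math.tanh(pre)
        dh = (1.0 - h * h) * dpre  # tanh'(pre) = 1 - tanh(pre)^2
    loss = 0.5 * (h - target) ** 2
    dloss = (h - target) * dh
    return loss, dloss


def directional_step(params, direction, xs, target, lr=0.01):
    """Move the parameters along `direction`, scaled by the directional derivative."""
    _, dloss = loss_and_directional_derivative(params, direction, xs, target)
    return tuple(p - lr * dloss * d for p, d in zip(params, direction))
```

If the candidate direction is drawn at random with zero mean and identity covariance, the expected update is an ordinary gradient descent step, although a single sample can be noisy. A direction that approximates the true gradient would be expected to reduce that noise, at the cost of having to produce the approximation.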
