

Virtual presentation / poster accept

On the Performance of Temporal Difference Learning With Neural Networks

Haoxing Tian · Ioannis Paschalidis · Alex Olshevsky

Keywords: [ Reinforcement Learning ]


Abstract: Neural Temporal Difference (TD) Learning is an approximate temporal difference method for policy evaluation that uses a neural network for function approximation. The analysis of Neural TD Learning has proven challenging. In this paper, we provide a convergence analysis of Neural TD Learning with a projection onto B(θ0, ω), a ball of fixed radius ω around the initial point θ0. We show an approximation bound of O(ϵ + 1/√m), where ϵ is the approximation quality of the best neural network in B(θ0, ω) and m is the width of all hidden layers in the network.
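The projected scheme the abstract describes can be sketched in a few lines: run ordinary TD(0) updates on a wide one-hidden-layer network, and after every step project the parameters back onto the ball B(θ0, ω) around the initialization. The sketch below is illustrative only, assuming a one-hidden-layer ReLU network of width m and random synthetic transitions; the names (`td_step`, `project`) and the specific architecture are assumptions, not details from the paper.

```python
# Hypothetical sketch of projected Neural TD(0) for policy evaluation.
# Assumes a one-hidden-layer ReLU network of width m; names and setup
# are illustrative, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)

m, d = 64, 4                      # hidden width m, state dimension d
gamma, alpha, omega = 0.9, 0.01, 1.0

# Parameters: input weights W (m x d) and output weights a (m,).
theta0 = {"W": rng.normal(size=(m, d)) / np.sqrt(d),
          "a": rng.choice([-1.0, 1.0], size=m) / np.sqrt(m)}
theta = {k: v.copy() for k, v in theta0.items()}

def value(theta, s):
    """V_theta(s) = a . relu(W s)."""
    return theta["a"] @ np.maximum(theta["W"] @ s, 0.0)

def grad(theta, s):
    """Gradient of V_theta(s) with respect to W and a."""
    z = theta["W"] @ s
    dW = np.outer(theta["a"] * (z > 0), s)   # dV/dW
    da = np.maximum(z, 0.0)                  # dV/da
    return {"W": dW, "a": da}

def project(theta, theta0, omega):
    """Project theta onto B(theta0, omega) in the flattened parameter space."""
    diff = np.concatenate([(theta[k] - theta0[k]).ravel() for k in ("W", "a")])
    norm = np.linalg.norm(diff)
    if norm <= omega:
        return theta
    scale = omega / norm
    return {k: theta0[k] + scale * (theta[k] - theta0[k]) for k in ("W", "a")}

def td_step(theta, s, r, s_next):
    """One projected TD(0) update: theta <- Proj(theta + alpha * delta * grad V)."""
    delta = r + gamma * value(theta, s_next) - value(theta, s)
    g = grad(theta, s)
    new = {k: theta[k] + alpha * delta * g[k] for k in ("W", "a")}
    return project(new, theta0, omega)

# Run a few updates on random transitions of a synthetic chain.
for _ in range(200):
    s, s_next = rng.normal(size=d), rng.normal(size=d)
    theta = td_step(theta, s, float(s[0]), s_next)

# The projection guarantees every iterate stays inside B(theta0, omega).
dist = np.linalg.norm(np.concatenate(
    [(theta[k] - theta0[k]).ravel() for k in ("W", "a")]))
assert dist <= omega + 1e-9
```

The projection step is what makes the analysis tractable: keeping the iterates within a fixed radius ω of θ0 keeps the network close to its linearization around the initialization, which is where the width-dependent 1/√m term in the bound comes from.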
