Virtual presentation / poster accept
On the Performance of Temporal Difference Learning With Neural Networks
Haoxing Tian · Ioannis Paschalidis · Alex Olshevsky
Keywords: Reinforcement Learning
Abstract:
Neural Temporal Difference (TD) Learning is an approximate temporal difference method for policy evaluation that uses a neural network for function approximation. Analysis of Neural TD Learning has proven to be challenging. In this paper we provide a convergence analysis of Neural TD Learning with a projection onto B(θ₀, ω), a ball of fixed radius ω around the initial point θ₀. We show an approximation bound of O(ϵ + 1/√m), where ϵ is the approximation error of the best neural network in B(θ₀, ω) and m is the width of all hidden layers in the network.
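To make the projected update concrete, below is a minimal sketch in PyTorch of one step of projected Neural TD(0): a semi-gradient update on the TD error, followed by a Euclidean projection of the parameters back onto the ball B(θ₀, ω). The one-hidden-layer ValueNet, the helper names (project_to_ball, neural_td_step), and the hyperparameters are illustrative assumptions for this sketch, not the paper's exact construction.

```python
import torch
import torch.nn as nn


class ValueNet(nn.Module):
    """Illustrative value network: one hidden layer of width m (an assumption,
    not necessarily the architecture analyzed in the paper)."""

    def __init__(self, state_dim, m):
        super().__init__()
        self.hidden = nn.Linear(state_dim, m)
        self.out = nn.Linear(m, 1)

    def forward(self, s):
        return self.out(torch.relu(self.hidden(s))).squeeze(-1)


def project_to_ball(net, theta0, omega):
    """Project the flattened parameter vector onto B(theta0, omega)."""
    with torch.no_grad():
        theta = torch.nn.utils.parameters_to_vector(net.parameters())
        diff = theta - theta0
        norm = diff.norm()
        if norm > omega:
            torch.nn.utils.vector_to_parameters(
                theta0 + (omega / norm) * diff, net.parameters())


def neural_td_step(net, theta0, omega, s, r, s_next, gamma=0.99, lr=1e-3):
    """One projected semi-gradient TD(0) step on a single transition."""
    v = net(s)
    with torch.no_grad():
        # Semi-gradient TD: the bootstrap target is treated as a constant.
        target = r + gamma * net(s_next)
    loss = 0.5 * (target - v) ** 2
    net.zero_grad()
    loss.backward()
    with torch.no_grad():
        for p in net.parameters():
            p -= lr * p.grad
    # Keep the iterate inside the ball of radius omega around theta0.
    project_to_ball(net, theta0, omega)


# Usage sketch (state_dim, m, and omega are placeholder values):
# net = ValueNet(state_dim=4, m=256)
# theta0 = torch.nn.utils.parameters_to_vector(net.parameters()).detach().clone()
# for each observed transition (s, r, s_next):
#     neural_td_step(net, theta0, omega=10.0, s=s, r=r, s_next=s_next)
```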