On the Performance of Temporal Difference Learning With Neural Networks

Haoxing Tian; Ioannis Paschalidis; Alex Olshevsky

ICLR 2023

Abstract

Neural Temporal Difference (TD) Learning is an approximate temporal difference method for policy evaluation that uses a neural network for function approximation. Its analysis has proven challenging. In this paper, we provide a convergence analysis of Neural TD Learning with a projection onto $B(\theta_0, \omega)$, a ball of fixed radius $\omega$ around the initial point $\theta_0$. We show an approximation bound of $O(\epsilon + 1/\sqrt{m})$, where $\epsilon$ is the approximation quality of the best neural network in $B(\theta_0, \omega)$ and $m$ is the width of all hidden layers in the network.
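To make the projection step concrete, the following is a minimal sketch of one projected semi-gradient TD(0) update, not the paper's exact algorithm. It assumes a Euclidean ball, a single hidden ReLU layer of width `m` with $1/\sqrt{m}$ output scaling, and a constant step size `alpha`. The function names and the parameter packing are hypothetical and chosen only for illustration.

```python
import numpy as np

def project_onto_ball(theta, theta0, omega):
    """Euclidean projection of theta onto B(theta0, omega)."""
    diff = theta - theta0
    norm = np.linalg.norm(diff)
    if norm <= omega:
        return theta
    # Rescale the displacement so it lies on the boundary of the ball.
    return theta0 + (omega / norm) * diff

def value_and_grad(theta, s, m):
    """Assumed network: V(s) = a^T relu(W s) / sqrt(m).

    theta packs W (m x d, row-major) followed by a (length m).
    Returns V(s) and its gradient with respect to theta.
    """
    d = s.shape[0]
    W = theta[: m * d].reshape(m, d)
    a = theta[m * d:]
    pre = W @ s
    h = np.maximum(pre, 0.0)
    v = a @ h / np.sqrt(m)
    grad_W = np.outer(a * (pre > 0), s) / np.sqrt(m)
    grad_a = h / np.sqrt(m)
    return v, np.concatenate([grad_W.ravel(), grad_a])

def projected_td_step(theta, theta0, omega, s, r, s_next, gamma, alpha, m):
    """One semi-gradient TD(0) update followed by projection onto the ball."""
    v, g = value_and_grad(theta, s, m)
    v_next, _ = value_and_grad(theta, s_next, m)
    delta = r + gamma * v_next - v  # TD error
    return project_onto_ball(theta + alpha * delta * g, theta0, omega)
```

Whatever the step size, the iterate returned by `projected_td_step` stays within distance `omega` of `theta0`, which is the constraint the analysis relies on.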
