A Non-Parametric Approach to Dynamic Programming

Oliver B. Kroemer; Jan R. Peters

2011 NIPS NeurIPS 2011

A Non-Parametric Approach to Dynamic Programming

Abstract

In this paper, we consider the problem of policy evaluation for continuous-state systems. We present a non-parametric approach to policy evaluation, which uses kernel density estimation to represent the system. The true form of the value function for this model can be determined, and can be computed using Galerkin's method. Furthermore, we also present a unified view of several well-known policy evaluation methods. In particular, we show that the same Galerkin method can be used to derive Least-Squares Temporal Difference learning, Kernelized Temporal Difference learning, and a discrete-state Dynamic Programming solution, as well as our proposed method. In a numerical evaluation of these algorithms, the proposed approach performed better than the other methods.

🌉 Interdisciplinary Bridge — Machine Learning and Mathematics & Optimization and Reinforcement Learning

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Speech & Audio

📈 Trend Setter — Value Iteration

🧭 Keyword Pioneer — galerkin method

🐣 Hot Topic Early Bird — temporal difference learning

Authors

Oliver B. Kroemer , Jan R. Peters

Topics

Machine Learning > Core Methods > Regression Machine Learning > Core Methods > Representation Learning Reinforcement Learning > Methods > Deep RL Reinforcement Learning > Methods > Policy Learning Mathematics & Optimization > Optimization > Continuous Optimization Machine Learning > Learning Types > Reinforcement Learning Machine Learning > Core Methods > Kernel Methods Reinforcement Learning > Methods > Value Iteration

Keywords

temporal difference learning policy evaluation dynamic programming value function approximation kernel density estimation galerkin method non-parametric method kernel methods

Download PDF

Related papers

Co-Training for Domain Adaptation 2011

The Local Rademacher Complexity of Lp-Norm Multiple Kernel Learning 2011

Learning to Agglomerate Superpixel Hierarchies 2011

A Reinforcement Learning Theory for Homeostatic Regulation 2011

A Global Structural EM Algorithm for a Model of Cancer Progression 2011