Stable Dual Dynamic Programming

Tao Wang; Michael Bowling; Dale Schuurmans; Daniel J. Lizotte

NIPS (NeurIPS) 2007

Abstract

Recently, we have introduced a novel approach to dynamic programming and reinforcement learning that is based on maintaining explicit representations of stationary distributions instead of value functions. In this paper, we investigate the convergence properties of these dual algorithms both theoretically and empirically, and show how they can be scaled up by incorporating function approximation.
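To make the idea of representing stationary distributions instead of value functions concrete, the sketch below evaluates a fixed policy on a small Markov chain in a dual form. It is an illustration under stated assumptions, not the paper's own code: the matrix M, the update M ← (1 − γ)I + γPM, the toy transition matrix P, the reward vector r, and the function name dual_policy_evaluation are all chosen here for the example. Each row of M is a discounted state-visit distribution, and the usual value function is recovered as Mr / (1 − γ).

```python
import numpy as np


def dual_policy_evaluation(P, gamma, tol=1e-10, max_iters=10_000):
    """Iterate M <- (1 - gamma) * I + gamma * P @ M until it stops changing.

    P is the state-to-state transition matrix of a fixed policy (rows sum to 1).
    Every iterate stays row-stochastic, so each row of M is a probability
    distribution over states. (Hypothetical helper written for this sketch.)
    """
    n = P.shape[0]
    M = np.eye(n)  # start from a valid row-stochastic matrix
    for _ in range(max_iters):
        M_next = (1.0 - gamma) * np.eye(n) + gamma * P @ M
        if np.max(np.abs(M_next - M)) < tol:
            return M_next
        M = M_next
    return M


# Toy 3-state chain and rewards, assumed purely for illustration.
P = np.array([[0.9, 0.1, 0.0],
              [0.0, 0.5, 0.5],
              [0.2, 0.0, 0.8]])
r = np.array([1.0, 0.0, 2.0])
gamma = 0.9

M = dual_policy_evaluation(P, gamma)

# Recover values from the dual representation and compare with the
# standard primal solution of v = r + gamma * P v.
v_dual = M @ r / (1.0 - gamma)
v_primal = np.linalg.solve(np.eye(3) - gamma * P, r)
```

In this sketch the quantity being iterated is always a set of probability distributions, so its entries stay in [0, 1] regardless of the reward scale, whereas the primal value vector scales with the rewards.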

Topics

Artificial Intelligence > Core AI > Agent Systems
Machine Learning > Optimization & Theory > Optimization
Reinforcement Learning > Methods > Deep RL
Machine Learning > Learning Types > Reinforcement Learning
Reinforcement Learning > Methods > Value Iteration

Keywords

reinforcement learning, function approximation, value function, dynamic programming, stationary distribution, dual dynamic programming
