NeurIPS (NIPS) 2007

Stable Dual Dynamic Programming

Abstract

Recently, we have introduced a novel approach to dynamic programming and reinforcement learning that is based on maintaining explicit representations of stationary distributions instead of value functions. In this paper, we investigate the convergence properties of these dual algorithms both theoretically and empirically, and show how they can be scaled up by incorporating function approximation.
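
To make the idea of a dual representation concrete, the following is a minimal sketch of dual policy evaluation for a fixed policy. It is an illustration under stated assumptions, not necessarily the exact algorithm analyzed in the paper: the abstract does not give the update rule, so the recursion M <- (1 - gamma) I + gamma P M, the function name `dual_policy_evaluation`, and the two-state transition matrix `P` and reward vector `r` are all assumed here. Each row of M stays a probability distribution over states (a discounted visit distribution), and the value function is recovered only at the end as v = M r / (1 - gamma).

```python
def dual_policy_evaluation(P, r, gamma, iters=500):
    """Iterate M <- (1 - gamma) * I + gamma * P @ M for a fixed policy.

    P is a row-stochastic transition matrix (list of lists), r a reward
    vector. All names and the update rule are illustrative assumptions.
    """
    n = len(P)
    # Start from the identity: every row is already a valid distribution.
    M = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(iters):
        # Matrix product P @ M, written out in plain Python.
        PM = [[sum(P[i][k] * M[k][j] for k in range(n)) for j in range(n)]
              for i in range(n)]
        # Convex combination of identity and P @ M keeps rows normalized.
        M = [[(1 - gamma) * (1.0 if i == j else 0.0) + gamma * PM[i][j]
              for j in range(n)]
             for i in range(n)]
    # Recover the value function from the distribution representation.
    v = [sum(M[i][j] * r[j] for j in range(n)) / (1 - gamma) for i in range(n)]
    return M, v


# Hypothetical two-state chain used only for illustration.
P = [[0.9, 0.1], [0.2, 0.8]]
r = [1.0, 0.0]
gamma = 0.9
M, v = dual_policy_evaluation(P, r, gamma)
```

Because every iterate is a convex combination of row-stochastic matrices, the representation stays bounded regardless of the reward scale, which is one intuition for why normalized representations of this kind might behave stably.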
