RAAM: The Benefits of Robustness in Approximating Aggregated MDPs in Reinforcement Learning

Marek Petrik; Dharmashankar Subramanian

NeurIPS (NIPS) 2014

Abstract

We describe how to use robust Markov decision processes for value function approximation with state aggregation. Compared with classical methods such as fitted value iteration, the robustness reduces the sensitivity to the approximation error of sub-optimal policies. This reduces the bounds on the gamma-discounted infinite-horizon performance loss by a factor of 1/(1-gamma) while preserving polynomial-time computational complexity. Our experimental results show that the robust representation can significantly improve solution quality at minimal additional computational cost.
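The idea of pairing state aggregation with a robust (worst-case) backup can be illustrated with a minimal sketch. It is not the authors' algorithm as specified in the paper; it assumes the simplest possible uncertainty model, in which an adversary may pick any original state inside the current aggregate state. The function name `robust_aggregated_value_iteration` and the argument layout (`P[a][s][t]` for transition probabilities, `R[s][a]` for rewards, `agg[s]` for the aggregate index of state `s`) are hypothetical choices made for this illustration.

```python
def robust_aggregated_value_iteration(P, R, agg, n_agg, gamma=0.9, iters=500):
    """Illustrative robust value iteration over aggregate states.

    Assumption (not taken from the abstract): within each aggregate state the
    adversary may choose any member state, so the backup is a max over actions
    of a min over member states. Every aggregate must contain at least one state.
    """
    n_states = len(R)
    n_actions = len(R[0])
    w = [0.0] * n_agg  # one value per aggregate state
    for _ in range(iters):
        # Q-values of the original states, using aggregate values lifted
        # back to the original state space via agg[t].
        q = [
            [
                R[s][a] + gamma * sum(P[a][s][t] * w[agg[t]] for t in range(n_states))
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
        # Robust backup: best action against the worst member state.
        w = [
            max(
                min(q[s][a] for s in range(n_states) if agg[s] == k)
                for a in range(n_actions)
            )
            for k in range(n_agg)
        ]
    return w


# Tiny example: two self-looping states with rewards 1 and 0, one action.
P = [[[1.0, 0.0], [0.0, 1.0]]]
R = [[1.0], [0.0]]
merged = robust_aggregated_value_iteration(P, R, agg=[0, 0], n_agg=1, gamma=0.5)
separate = robust_aggregated_value_iteration(P, R, agg=[0, 1], n_agg=2, gamma=0.5)
```

When both states are merged into one aggregate, the worst-case backup assigns the aggregate the value of its worst member, so the estimate is pessimistic rather than an average. When every state is its own aggregate, the procedure reduces to ordinary value iteration.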

Authors

Marek Petrik, Dharmashankar Subramanian

Topics

Machine Learning > Optimization & Theory > Optimization

Machine Learning > Application Areas > Domain Adaptation

Reinforcement Learning > Methods > Deep RL

Machine Learning > Learning Types > Reinforcement Learning

Machine Learning > Learning Types > Deep Learning

Machine Learning > Core Methods > Optimization

Machine Learning > Learning Types > Robustness

Keywords

reinforcement learning, robust optimization, Markov decision process, value function approximation, approximate dynamic programming, fitted value iteration, robust Markov decision processes, state aggregation
