The Robustness-Performance Tradeoff in Markov Decision Processes

Huan Xu; Shie Mannor

NIPS (NeurIPS) 2006

Abstract

Computing a satisfactory control policy for a Markov decision process when the parameters of the model are not exactly known is a problem encountered in many practical applications. The traditional robust approach is based on a worst-case analysis and may lead to an overly conservative policy. In this paper we consider the tradeoff between nominal performance and worst-case performance over all possible models. Based on parametric linear programming, we propose a method that computes the whole set of Pareto-efficient policies in the performance-robustness plane when only the reward parameters are subject to uncertainty. In the more general case, when the transition probabilities are also subject to error, we show that the strategy with the "optimal" tradeoff might be non-Markovian and hence is in general not tractable.
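To make the performance-robustness plane concrete, here is a minimal sketch on a hypothetical two-state, two-action discounted MDP. It is not the paper's parametric linear programming method: it simply enumerates deterministic policies by brute force and keeps the non-dominated ones, so it misses any randomized policies on the frontier. All numbers, the discount factor, and the names `P`, `R_NOMINAL`, `R_LOWER`, `evaluate`, and `pareto_front` are assumptions made for illustration. The sketch also assumes that each reward varies independently within an interval, in which case the worst-case return of a policy is its return under the lower-bound rewards.

```python
from itertools import product

GAMMA = 0.9  # discount factor (assumed for this toy example)
STATES, ACTIONS = (0, 1), (0, 1)

# P[s][a] = next-state probabilities; transitions are known exactly here.
P = {
    0: {0: [0.9, 0.1], 1: [0.2, 0.8]},
    1: {0: [0.7, 0.3], 1: [0.1, 0.9]},
}
# Hypothetical nominal reward and guaranteed lower bound per (state, action).
R_NOMINAL = {0: {0: 1.0, 1: 2.0}, 1: {0: 0.5, 1: 3.0}}
R_LOWER = {0: {0: 0.9, 1: 0.5}, 1: {0: 0.4, 1: 0.0}}


def evaluate(policy, reward, start=0, sweeps=2000):
    """Discounted return of a deterministic policy from `start` (fixed-point iteration)."""
    v = [0.0 for _ in STATES]
    for _ in range(sweeps):
        v = [
            reward[s][policy[s]]
            + GAMMA * sum(p * v[t] for t, p in zip(STATES, P[s][policy[s]]))
            for s in STATES
        ]
    return v[start]


def pareto_front():
    """Return (policy, nominal, worst_case) triples not dominated by any other policy."""
    points = []
    for policy in product(ACTIONS, repeat=len(STATES)):
        nominal = evaluate(policy, R_NOMINAL)
        # Interval reward uncertainty: the worst case is attained at the lower bounds.
        worst = evaluate(policy, R_LOWER)
        points.append((policy, nominal, worst))
    front = [
        p for p in points
        if not any(
            q[1] >= p[1] and q[2] >= p[2] and (q[1] > p[1] or q[2] > p[2])
            for q in points
        )
    ]
    return sorted(front, key=lambda p: p[1])


if __name__ == "__main__":
    for policy, nominal, worst in pareto_front():
        print(policy, round(nominal, 3), round(worst, 3))
```

Each printed row is one point in the performance-robustness plane. Moving along the front toward higher nominal return never increases the worst-case return, which is the tradeoff the abstract describes.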

Topics

Artificial Intelligence > Core AI > Agent Systems
Machine Learning > Optimization & Theory > Optimization
Machine Learning > Optimization & Theory > Theory
Reinforcement Learning > Methods > Deep RL
Mathematics & Optimization > Optimization > Continuous Optimization
Machine Learning > Learning Types > Reinforcement Learning
Mathematics & Optimization > Optimization > Optimization
Mathematics & Optimization > Optimization > Multi-Objective Optimization
Machine Learning > Learning Types > Robustness

Keywords

reinforcement learning, robust optimization, policy optimization, markov decision processes, robustness performance tradeoff, pareto efficiency, markov decision process, parametric linear programming, worst-case analysis, linear programming, worst-case performance
