Risk Aversion in Markov Decision Processes via Near Optimal Chernoff Bounds

Teodor M. Moldovan; Pieter Abbeel

NIPS (NeurIPS) 2012

Abstract

The expected return is a widely used objective in decision making under uncertainty. Many algorithms, such as value iteration, have been proposed to optimize it. In risk-aware settings, however, the expected return is often not an appropriate objective to optimize. We propose a new optimization objective for risk-aware planning and show that it has desirable theoretical properties. We also draw connections to previously proposed objectives for risk-aware planning: minmax, exponential utility, percentile, and mean minus variance. Our method applies to an extended class of Markov decision processes: we allow costs to be stochastic as long as they are bounded. Additionally, we present an efficient algorithm for optimizing the proposed objective. Synthetic and real-world experiments illustrate the effectiveness of our method at scale.
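The abstract does not spell out the proposed objective, so the following is only a minimal sketch of the generic Chernoff bound that the title refers to, not the authors' algorithm. For a random cost X and any s > 0, Markov's inequality gives P(X >= t) <= E[exp(sX)] exp(-st); setting the right-hand side to delta and solving for t yields a level that X exceeds with probability at most delta. The function names, the sample costs, and the grid search over s are all assumptions made for illustration.

```python
import math


def log_mean_exp(values, s):
    """Return log E[exp(s * X)] for equally weighted samples, computed stably."""
    m = max(s * v for v in values)
    return m + math.log(sum(math.exp(s * v - m) for v in values) / len(values))


def chernoff_quantile_bound(costs, delta, scales=None):
    """Return a level t such that P(cost >= t) <= delta under the sample distribution.

    For each s > 0, t(s) = (log E[exp(s X)] - log delta) / s satisfies the
    Chernoff bound. We return the smallest t(s) over a grid of s values,
    which is a crude stand-in for the exact infimum over s.
    """
    if scales is None:
        # Hypothetical log-spaced grid from 1e-3 to 1e3.
        scales = [10 ** (k / 10) for k in range(-30, 31)]
    return min((log_mean_exp(costs, s) - math.log(delta)) / s for s in scales)


# Illustrative (made-up) cost samples with one rare, expensive outcome.
costs = [0.0, 1.0, 1.0, 2.0, 10.0]
mean_cost = sum(costs) / len(costs)
worst_cost = max(costs)

bound_mean_like = chernoff_quantile_bound(costs, delta=1.0)
bound_loose = chernoff_quantile_bound(costs, delta=0.5)
bound_tight = chernoff_quantile_bound(costs, delta=0.01)
```

In this sketch, delta acts as a risk-aversion knob: at delta = 1 the bound stays close to the mean cost, and as delta shrinks the bound moves toward the worst-case cost. This mirrors the spectrum between the expected-return and minmax objectives that the abstract mentions, and the term log E[exp(sX)] is the quantity optimized by exponential utility.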

Topics

Artificial Intelligence > Core AI > Planning
Artificial Intelligence > Core AI > Risk Management
Machine Learning > Optimization & Theory > Optimization
Machine Learning > Application Areas > Risk Management
Machine Learning > Learning Types > Reinforcement Learning
Reinforcement Learning > Methods > Deep RL
Reinforcement Learning > Methods > Policy Learning
Reinforcement Learning > Methods > Value Iteration
Reinforcement Learning > Applications > Value Iteration
Mathematics & Optimization > Optimization > Stochastic Methods

Keywords

stochastic optimization, reinforcement learning, markov decision processes, decision making, risk-aware planning, markov decision process, value iteration, risk aversion, risk averse optimization, optimization under uncertainty, chernoff bound
