NeurIPS (NIPS) 2012

Risk Aversion in Markov Decision Processes via Near Optimal Chernoff Bounds

Abstract

The expected return is a widely used objective in decision making under uncertainty. Many algorithms, such as value iteration, have been proposed to optimize it. In risk-aware settings, however, the expected return is often not an appropriate objective to optimize. We propose a new optimization objective for risk-aware planning and show that it has desirable theoretical properties. We also draw connections to previously proposed objectives for risk-aware planning: minmax, exponential utility, percentile, and mean minus variance. Our method applies to an extended class of Markov decision processes: we allow costs to be stochastic as long as they are bounded. Additionally, we present an efficient algorithm for optimizing the proposed objective. Synthetic and real-world experiments illustrate the effectiveness of our method at scale.
