Regret Bounds for Risk-Sensitive Reinforcement Learning

Osbert Bastani; Jason Yecheng Ma; Estelle Shen; Wanqiao Xu

NeurIPS 2022

Abstract

In safety-critical applications of reinforcement learning such as healthcare and robotics, it is often desirable to optimize risk-sensitive objectives that account for tail outcomes rather than expected reward. We prove the first regret bounds for reinforcement learning under a general class of risk-sensitive objectives including the popular CVaR objective. Our theory is based on a novel characterization of the CVaR objective as well as a novel optimistic MDP construction.
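As a rough illustration of why a tail-focused objective can differ from expected reward, the sketch below computes an empirical CVaR over a list of sampled returns. It assumes the common convention for rewards, in which CVaR at level alpha is the average of the worst alpha-fraction of outcomes. The function name `empirical_cvar` and the sample data are hypothetical and are not taken from the paper.

```python
import math


def empirical_cvar(returns, alpha):
    """Average of the worst ceil(alpha * n) returns (the lower tail).

    Assumption: higher returns are better, so the "tail" is the set of
    lowest returns. This is an illustrative estimator, not the paper's method.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if not returns:
        raise ValueError("returns must be non-empty")
    ordered = sorted(returns)                 # ascending: worst outcomes first
    k = math.ceil(alpha * len(ordered))       # number of samples in the tail
    tail = ordered[:k]
    return sum(tail) / len(tail)


# Hypothetical episode returns: mostly good, with one rare bad outcome.
returns = [10.0, 9.0, 8.0, 7.0, -20.0, 6.0, 5.0, 4.0, 3.0, 2.0]
mean_return = sum(returns) / len(returns)     # risk-neutral objective
cvar_20 = empirical_cvar(returns, 0.2)        # average of the two worst returns
print(mean_return, cvar_20)
```

With alpha equal to 1 the estimate reduces to the ordinary mean, so the risk-neutral objective is a special case. Smaller values of alpha put more weight on the rare bad outcome.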

Authors

Osbert Bastani, Jason Yecheng Ma, Estelle Shen, Wanqiao Xu

Topics

Artificial Intelligence > Core AI > AI Safety
Reinforcement Learning > Methods > Policy Learning
Machine Learning > Learning Types > Reinforcement Learning
Mathematics & Optimization > Optimization > Robust Optimization
Artificial Intelligence > Core AI > Reinforcement Learning

Keywords

Markov decision process; regret bound; risk-sensitive reinforcement learning; conditional value at risk; tail outcome
