Risk-Conditioned Reinforcement Learning: A Generalized Approach for Adapting to Varying Risk Measures

Gwangpyo Yoo; Jinwoo Park; Honguk Woo

2024 AAAI AAAI 2024

Risk-Conditioned Reinforcement Learning: A Generalized Approach for Adapting to Varying Risk Measures

Abstract

Abstract In application domains requiring mission-critical decision making, such as finance and robotics, the optimal policy derived by reinforcement learning (RL) often hinges on a preference for risk management. Yet, the dynamic nature of risk measures poses considerable challenges to achieving generalization and adaptation of risk-sensitive policies in the context of RL. In this paper, we propose a risk-conditioned RL model that enables rapid policy adaptation to varying risk measures via a unified risk representation, the Weighted Value-at-Risk (WV@R). To sample risk measures that avoid undue optimism, we construct a risk proposal network employing a conditional adversarial auto-encoder and a normalizing flow. This network establishes coherent representations for risk measures, preserving the continuity in terms of the Wasserstein distance on the risk measures. The normalizing flow is used to support non-crossing quantile regression that obtains valid samples for risk measures, and it is also applied to the agent’s critic to ascertain the preservation of monotonicity in quantile estimations. Through experiments with locomotion, finance, and self-driving scenarios, we show that our model is capable of adapting to a range of risk measures, achieving comparable performance to the baseline models individually trained for each measure. Our model often outperforms the baselines, especially in the cases when exploration is required during training but risk-aversion is favored during evaluation.

🌉 Interdisciplinary Bridge — Machine Learning and Reinforcement Learning

🧭 Keyword Pioneer — risk-conditioned reinforcement learning

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Gwangpyo Yoo , Jinwoo Park , Honguk Woo

Topics

Machine Learning > Application Areas > Risk Management Reinforcement Learning > Methods > Deep RL Machine Learning > Learning Types > Reinforcement Learning Machine Learning > Optimization & Theory > Stochastic Methods

Keywords

reinforcement learning risk management policy learning quantile regression normalizing flow risk-sensitive policy risk-conditioned reinforcement learning weighted value-at-risk conditional adversarial auto-encoder

Download PDF

Related papers

Goal Alignment: Re-analyzing Value Alignment Problems Using Human-Aware AI 2024

Meta-Inverse Reinforcement Learning for Mean Field Games via Probabilistic Context Variables 2024

Suppressing Uncertainty in Gaze Estimation 2024

Mask-Homo: Pseudo Plane Mask-Guided Unsupervised Multi-Homography Estimation 2024

Heterogeneous Test-Time Training for Multi-Modal Person Re-identification 2024