$\mathcal{X}$-Armed Bandits: Optimizing Quantiles, CVaR and Other Risks

Léonard Torossian; Aurélien Garivier; Victor Picheny

ACML 2019

Abstract

We propose and analyze StoROO, an algorithm derived from StoOO for risk optimization of stochastic black-box functions. Motivated by risk-averse decision making in fields such as agriculture, medicine, biology, and finance, we focus not on the mean payoff but on generic functionals of the return distribution. We provide a generic regret analysis of StoROO and illustrate its applicability with two examples: the optimization of quantiles and of the conditional value at risk (CVaR). Inspired by the bandit literature and black-box mean optimizers, StoROO relies on the ability to construct confidence intervals for the targeted functional from random-size samples. We detail their construction in the case of quantiles, providing tight bounds based on the Kullback-Leibler divergence. Finally, we present numerical experiments showing that tight bounds have a dramatic impact on the optimization of quantiles and CVaR.
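To make the two risk functionals and the idea of a Kullback-Leibler confidence interval concrete, here is a minimal Python sketch. It is not the construction from the paper: it assumes a fixed sample size rather than the random-size samples StoROO works with, it uses a plain Chernoff bound on the binomial count of samples below the true quantile, and it takes CVaR to be the mean of the lower tail (conventions differ across the literature). All function names are hypothetical.

```python
import math


def bernoulli_kl(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q), for 0 < q < 1."""
    total = 0.0
    if p > 0.0:
        total += p * math.log(p / q)
    if p < 1.0:
        total += (1.0 - p) * math.log((1.0 - p) / (1.0 - q))
    return total


def empirical_quantile(samples, tau):
    """Order statistic of rank ceil(tau * n), a simple quantile estimate."""
    ordered = sorted(samples)
    k = max(1, math.ceil(tau * len(ordered)))
    return ordered[k - 1]


def empirical_cvar(samples, alpha):
    """Mean of the ceil(alpha * n) smallest samples (lower-tail convention,
    an assumption made for this sketch)."""
    ordered = sorted(samples)
    k = max(1, math.ceil(alpha * len(ordered)))
    return sum(ordered[:k]) / k


def kl_quantile_interval(samples, tau, delta):
    """Order-statistic interval for the tau-quantile.

    Each side uses the Chernoff bound exp(-n * kl(k/n, tau)) on a binomial
    tail, so each side fails with probability at most delta (fixed n only).
    """
    ordered = sorted(samples)
    n = len(ordered)
    threshold = math.log(1.0 / delta)
    lower, upper = -math.inf, math.inf
    # Lower bound: largest rank k whose deviation below tau is significant.
    for k in range(n, 0, -1):
        frac = (k - 1) / n
        if frac <= tau and n * bernoulli_kl(frac, tau) >= threshold:
            lower = ordered[k - 1]
            break
    # Upper bound: smallest rank k whose deviation above tau is significant.
    for k in range(1, n + 1):
        frac = k / n
        if frac >= tau and n * bernoulli_kl(frac, tau) >= threshold:
            upper = ordered[k - 1]
            break
    return lower, upper
```

When the sample is too small for either side to reach the threshold, the corresponding bound stays infinite, which mirrors the fact that a few observations carry little information about extreme quantiles.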

Topics

Machine Learning > Application Areas > Risk Management
Mathematics & Optimization > Optimization > Stochastic Methods

Keywords

stochastic optimization, black-box optimization, quantile optimization, conditional value at risk, risk-averse optimization
