2021
AAAI
Near-Optimal MNL Bandits Under Risk Criteria
Abstract
We study MNL bandits, a variant of the traditional multi-armed bandit problem, under risk criteria. Unlike the ordinary expected revenue, risk criteria are more general objectives that are widely used in industry and business. We design algorithms for a broad class of risk criteria, including but not limited to the well-known conditional value-at-risk, Sharpe ratio, and entropy risk, and prove that they achieve near-optimal regret. As a complement, we also conduct experiments on both synthetic and real data to demonstrate the empirical performance of the proposed algorithms.
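To make the distinction between expected revenue and a risk criterion concrete, the following is a minimal sketch rather than the paper's algorithm. It assumes the standard MNL choice model (the no-purchase option has weight 1 and revenue 0) and common textbook forms of the three named criteria; the paper's exact definitions and normalizations may differ. All function names, the preference weights `v`, the revenues `r`, and the parameters `alpha` and `theta` are hypothetical choices for illustration.

```python
import math

def mnl_choice_probs(assortment, v):
    """MNL choice probabilities; the key None is the no-purchase option (weight 1)."""
    denom = 1.0 + sum(v[i] for i in assortment)
    probs = {None: 1.0 / denom}
    for i in assortment:
        probs[i] = v[i] / denom
    return probs

def revenue_distribution(assortment, v, r):
    """List of (revenue, probability) pairs for one customer; no purchase earns 0."""
    probs = mnl_choice_probs(assortment, v)
    return [(0.0 if i is None else r[i], p) for i, p in probs.items()]

def expected_revenue(dist):
    return sum(x * p for x, p in dist)

def cvar(dist, alpha):
    """Average revenue over the worst alpha-fraction of outcomes (0 < alpha <= 1)."""
    remaining, total = alpha, 0.0
    for x, p in sorted(dist):          # lowest revenues first
        take = min(p, remaining)       # probability mass taken from this outcome
        total += take * x
        remaining -= take
        if remaining <= 1e-12:
            break
    return total / alpha

def sharpe_ratio(dist, eps=1e-12):
    """Mean divided by standard deviation; eps (an assumption) avoids division by zero."""
    mean = expected_revenue(dist)
    var = sum(p * (x - mean) ** 2 for x, p in dist)
    return mean / math.sqrt(var + eps)

def entropic_risk(dist, theta):
    """Risk-averse entropic criterion -(1/theta) * log E[exp(-theta * R)], theta > 0."""
    return -math.log(sum(p * math.exp(-theta * x) for x, p in dist)) / theta

if __name__ == "__main__":
    v = {1: 1.0, 2: 1.0}               # hypothetical preference weights
    r = {1: 1.0, 2: 2.0}               # hypothetical per-item revenues
    for s in ([1, 2], [2]):
        d = revenue_distribution(s, v, r)
        print(s, expected_revenue(d), cvar(d, 0.5), sharpe_ratio(d), entropic_risk(d, 1.0))
```

With these illustrative numbers, the assortments {1, 2} and {2} have the same expected revenue of 1, yet their CVaR at level 0.5 differs (1/3 versus 0), so a risk-sensitive objective can prefer one assortment where the expected-revenue objective is indifferent.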
Topics
Machine Learning > Optimization & Theory > Stochastic Processes
Machine Learning > Application Areas > Risk Management
Reinforcement Learning > Applications
Mathematics & Optimization > Optimization > Online Algorithms
Machine Learning > Learning Types > Multi-Armed Bandits
Machine Learning > Learning Types > Risk Management