Reducing Exploration of Dying Arms in Mortal Bandits

Stefano Tracà; Cynthia Rudin; Weiyu Yan

UAI 2019


Abstract

Mortal bandits have proven extremely useful for news article recommendation, automated online advertising campaigns, and other applications where the set of available options changes over time. Previous work on this problem showed how to regulate exploration of new arms when they have recently appeared, but it does not adapt when arms are about to disappear. Since in most applications we can determine, either exactly or approximately, when arms will disappear, we can leverage this information to improve performance: we should not explore arms that are about to disappear. We provide adaptations of algorithms, regret bounds, and experiments for this setting, showing a clear benefit from regulating greed (the exploration/exploitation balance) for arms that will soon disappear. We illustrate numerical performance on the Yahoo! Front Page Today Module User Click Log Dataset.
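To make the core idea concrete, the following is a minimal sketch of how exploration might be reduced for an arm that is close to disappearing. It is not the algorithm from the paper: the UCB1-style index, the linear damping rule, and the names `lifetime_aware_index`, `choose_arm`, `remaining_life`, and `horizon_scale` are all assumptions introduced here for illustration.

```python
import math


def lifetime_aware_index(mean_reward, pulls, t, remaining_life, horizon_scale=50.0):
    """UCB1-style index whose exploration bonus shrinks as the arm nears expiry.

    The linear damping rule and `horizon_scale` are illustrative assumptions,
    not taken from the paper.
    """
    if pulls == 0:
        # Standard UCB initialization: an unplayed arm is tried first.
        return math.inf
    bonus = math.sqrt(2.0 * math.log(max(t, 2)) / pulls)
    # Damping factor in [0, 1]: 1 for long-lived arms, near 0 close to expiry.
    damping = min(1.0, max(0.0, remaining_life / horizon_scale))
    return mean_reward + damping * bonus


def choose_arm(arms, t):
    """Return the id of the live arm with the highest index, or None if none is live.

    `arms` maps an arm id to a dict with keys 'mean', 'pulls', 'remaining_life'
    (a hypothetical layout chosen for this sketch).
    """
    live = {k: a for k, a in arms.items() if a["remaining_life"] > 0}
    if not live:
        return None
    return max(
        live,
        key=lambda k: lifetime_aware_index(
            live[k]["mean"], live[k]["pulls"], t, live[k]["remaining_life"]
        ),
    )


arms = {
    "established": {"mean": 0.5, "pulls": 100, "remaining_life": 1000},
    "dying": {"mean": 0.4, "pulls": 2, "remaining_life": 1},
}
choice = choose_arm(arms, t=200)
```

In this example the rarely played arm would win under a plain UCB1 index because of its large exploration bonus, but with only one round of life left its bonus is damped almost to zero, so the established arm is chosen instead.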



Topics

Data Science & Analytics > Applications > Recommender Systems
Machine Learning > Learning Types > Online Learning
Machine Learning > Optimization & Theory > Online Algorithms
Machine Learning > Learning Types > Multi-Armed Bandits

Keywords

online learning, exploration, exploitation, regret bound, recommendation system, mortal bandit
