Large-Scale Bandit Problems and KWIK Learning

Jacob Abernethy; Kareem Amin; Michael Kearns; Moez Draief

2013 ICML ICML 2013

Large-Scale Bandit Problems and KWIK Learning

Abstract

We show that parametric multi-armed bandit (MAB) problems with large state and action spaces can be algorithmically reduced to the supervised learning model known as Knows What It Knows or KWIK learning. We give matching impossibility results showing that the KWIK learnability requirement cannot be replaced by weaker supervised learning assumptions. We provide such results in both the standard parametric MAB setting, as well as for a new model in which the action space is finite but growing with time.

🚀 Conference Pioneer — ICML 2013

🐣 Hot Topic Early Bird — supervised learning

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Jacob Abernethy , Kareem Amin , Michael Kearns , Moez Draief

Topics

Machine Learning > Optimization & Theory > Learning Theory Machine Learning > Optimization & Theory > Optimization

Keywords

online learning supervised learning kwik learning multi-armed bandit

Download PDF

Related papers

Convex Adversarial Collective Classification 2013

Gaussian Process Vine Copulas for Multivariate Dependence 2013

Stochastic Simultaneous Optimistic Optimization 2013

Generic Exploration and K-armed Voting Bandits 2013

Robust Structural Metric Learning 2013