2020 AISTATS AISTATS 2020

The True Sample Complexity of Identifying Good Arms

Abstract

We consider two multi-armed bandit problems with $n$ arms: \emph{(i)} given an $\epsilon > 0$, identify an arm with mean that is within $\epsilon$ of the largest mean and \emph{(ii)} given a threshold $\mu_0$ and integer $k$, identify $k$ arms with means larger than $\mu_0$. Existing lower bounds and algorithms for the PAC framework suggest that both of these problems require $\Omega(n)$ samples. However, we argue that the PAC framework not only conflicts with how these algorithms are used in practice, but also that these results disagree with intuition that says \emph{(i)} requires only $\Theta(\frac{n}{m})$ samples where $m = |\{ i : \mu_i > \max_{j \in [n]} \mu_j - \epsilon\}|$ and \emph{(ii)} requires $\Theta(\frac{n}{m}k)$ samples where $m = |\{ i : \mu_i > \mu_0 \}|$. We provide definitions that formalize these intuitions, obtain lower bounds that match the above sample complexities, and develop explicit, practical algorithms that achieve nearly matching upper bounds.

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio