Tight (Lower) Bounds for the Fixed Budget Best Arm Identification Bandit Problem

Alexandra Carpentier; Andrea Locatelli

2016 COLT COLT 2016

Tight (Lower) Bounds for the Fixed Budget Best Arm Identification Bandit Problem

Abstract

We consider the problem of \textitbest arm identification with a \textitfixed budget T, in the K-armed stochastic bandit setting, with arms distribution defined on [0,1]. We prove that any bandit strategy, for at least one bandit problem characterized by a complexity H, will misidentify the best arm with probability lower bounded by $\exp\Big(-\frac{T}\log(K)H\Big)$, where $H$ is the sum for all sub-optimal arms of the inverse of the squared gaps. Our result disproves formally the general belief - coming from results in the fixed confidence setting - that there must exist an algorithm for this problem whose probability of error is upper bounded by $\exp(-T/H)$. This also proves that some existing strategies based on the Successive Rejection of the arms are optimal - closing therefore the current gap between upper and lower bounds for the fixed budget best arm identification problem.

🌉 Interdisciplinary Bridge — Machine Learning and Mathematics & Optimization

🧭 Keyword Pioneer — fixed budget

🐣 Hot Topic Early Bird — lower bound

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy

Authors

Alexandra Carpentier , Andrea Locatelli

Topics

Machine Learning > Optimization & Theory > Learning Theory Mathematics & Optimization > Optimization > Online Algorithms

Keywords

best arm identification multi-armed bandit lower bound stochastic bandit fixed budget

Download PDF

Related papers

Open Problem: Approximate Planning of POMDPs in the class of Memoryless Policies 2016

Open Problem: Best Arm Identification: Almost Instance-Wise Optimality and the Gap Entropy Conjecture 2016

Open Problem: Kernel methods on manifolds and metric spaces. What is the probability of a positive definite geodesic exponential kernel? 2016

Learning and Testing Junta Distributions 2016

Monte Carlo Markov Chain Algorithms for Sampling Strongly Rayleigh Distributions and Determinantal Point Processes 2016