Optimal PAC Multiple Arm Identification with Applications to Crowdsourcing

Yuan Zhou; Xi Chen; Jian Li

ICML 2014


Abstract

We study the problem of selecting the K arms with the highest expected rewards in a stochastic N-armed bandit game. Instead of using existing evaluation metrics (e.g., misidentification probability or the metric in EXPLORE-K), we propose to use the aggregate regret, defined as the gap between the average reward of the optimal solution and that of our solution. Besides being a natural metric in itself, we argue that in many applications, such as our motivating example from crowdsourcing, the aggregate regret is the more suitable metric. We propose a new PAC algorithm which, with probability at least 1-δ, identifies a set of K arms with regret at most ε. We provide a sample complexity bound for our algorithm. To complement it, we establish a lower bound and show that the sample complexity of our algorithm matches this lower bound. Finally, we report experimental results on both synthetic and real data sets, which demonstrate the superior performance of the proposed algorithm.
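The aggregate regret described in the abstract can be illustrated with a minimal Python sketch. It assumes the true mean rewards are known, which is only the case in a simulation, and it computes the metric alone, not the PAC algorithm proposed in the paper. The function name `aggregate_regret` and the example means are hypothetical and chosen purely for illustration.

```python
def aggregate_regret(means, selected, k):
    """Gap between the average true mean of the best k arms
    and the average true mean of the selected k arms.

    means    -- true expected rewards of all N arms (known only in simulation)
    selected -- indices of the k arms returned by some algorithm
    """
    if len(selected) != k or len(set(selected)) != k:
        raise ValueError("selected must contain exactly k distinct arm indices")
    best = sorted(means, reverse=True)[:k]   # optimal solution: top-k true means
    chosen = [means[i] for i in selected]    # our solution
    return (sum(best) - sum(chosen)) / k


# Hypothetical example: 5 arms, select K = 2.
means = [0.9, 0.8, 0.5, 0.4, 0.1]
regret = aggregate_regret(means, [0, 2], k=2)  # picked arm 2 instead of arm 1
```

In this example the selection misses one of the two best arms, yet the regret is only about 0.15 because the substituted arm is not much worse. Under the PAC guarantee in the abstract, a selection is acceptable whenever this quantity is at most ε.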



Topics

Machine Learning > Learning Types > Active Learning
Machine Learning > Optimization & Theory > Learning Theory
Machine Learning > Optimization & Theory > Statistical Learning
Artificial Intelligence > Core AI > Reasoning
Machine Learning > Learning Types > Multi-Armed Bandits

Keywords

sample complexity, PAC learning, multi-armed bandit, arm identification, aggregate regret

