Finding a most biased coin with fewest flips

Karthekeyan Chandrasekaran; Richard Karp

COLT 2014

Abstract

We study the problem of learning a most biased coin among a set of coins by tossing the coins adaptively. The goal is to minimize the number of tosses until we identify a coin whose posterior probability of being most biased is at least 1-δ for a given δ. Under a particular probabilistic model, we give an optimal algorithm, i.e., an algorithm that minimizes the expected number of future tosses. The problem is closely related to finding the best arm in the multi-armed bandit problem using adaptive strategies. Our algorithm employs an optimal adaptive strategy, that is, a strategy that performs the best possible action at each step after observing the outcomes of all previous coin tosses. Consequently, our algorithm is also optimal for any given starting history of outcomes. To our knowledge, this is the first algorithm that employs an optimal adaptive strategy under a Bayesian setting for this problem. Our proof of optimality employs mathematical tools from the area of Markov games.
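The stopping criterion in the abstract (stop once some coin's posterior probability of being most biased reaches 1-δ) can be illustrated with a minimal sketch. The abstract does not spell out the probabilistic model, so the code below assumes a hypothetical two-type model: each coin is independently "heavy" (heads probability `p_heavy`) with prior probability `prior`, and otherwise "light" (heads probability `p_light`). Under that assumption a heavy coin is always a most biased coin, so its posterior of being heavy is a lower bound on its posterior of being most biased. The toss-selection rule here is a simple greedy heuristic chosen for illustration; it is not the paper's optimal strategy, and all function and parameter names are invented for this example.

```python
import math
import random


def posterior_heavy(heads, tails, prior, p_heavy, p_light):
    """Posterior probability that a coin is heavy, given its toss counts.

    Assumed model (not taken from the abstract): the coin is heavy with
    probability `prior`, and tosses are independent given the coin's type.
    Computed through log-odds to avoid underflow for long histories.
    """
    log_odds = (
        math.log(prior / (1.0 - prior))
        + heads * math.log(p_heavy / p_light)
        + tails * math.log((1.0 - p_heavy) / (1.0 - p_light))
    )
    # Numerically stable logistic function.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


def find_biased_coin(biases, delta, prior=0.5, p_heavy=0.6, p_light=0.4,
                     rng=None, max_tosses=100_000):
    """Toss coins until one has posterior >= 1 - delta of being heavy.

    `biases` holds the true heads probabilities, used only to simulate
    tosses. Returns (coin_index, number_of_tosses), or (None, max_tosses)
    if the toss budget runs out first.
    """
    rng = rng or random.Random(0)
    counts = [[0, 0] for _ in biases]  # [heads, tails] per coin
    tosses = 0
    while True:
        posts = [posterior_heavy(h, t, prior, p_heavy, p_light)
                 for h, t in counts]
        best = max(range(len(biases)), key=posts.__getitem__)
        if posts[best] >= 1.0 - delta:
            return best, tosses
        if tosses >= max_tosses:
            return None, tosses
        # Greedy heuristic (an illustrative choice, not the optimal
        # strategy): toss the coin that currently looks most promising.
        if rng.random() < biases[best]:
            counts[best][0] += 1
        else:
            counts[best][1] += 1
        tosses += 1


if __name__ == "__main__":
    coin, n = find_biased_coin([0.4, 0.6, 0.4], delta=0.05)
    print("selected coin:", coin, "after", n, "tosses")
```

With a symmetric prior and these default parameters, the posterior depends only on the difference between heads and tails, so the sketch stops as soon as some coin has shown eight more heads than tails at δ = 0.05.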

Topics

Artificial Intelligence > Core AI > Multi-Agent Systems
Machine Learning > Optimization & Theory > Bayesian Inference
Machine Learning > Bayesian & Probabilistic > Bayesian Inference
Machine Learning > Learning Types > Multi-Armed Bandits

Keywords

Bayesian inference, sequential decision, posterior probability, multi-armed bandit, Markov game, adaptive strategy, optimal exploration, optimal stopping
