On the Asymptotic Optimality of Confidence Interval Based Algorithms for Fixed Confidence MABs

Kushal Kejriwal; Nikhil Karamchandani; Jayakrishnan Nair

2025 AAAI AAAI 2025

On the Asymptotic Optimality of Confidence Interval Based Algorithms for Fixed Confidence MABs

Abstract

Abstract In this work, we address the challenge of identifying the optimal arm in a stochastic multi-armed bandit scenario with the minimum number of arm pulls, given a predefined error probability in a fixed confidence setting. Our focus is on examining the asymptotic behavior of sample complexity and the distribution of arm weights upon termination, as the error threshold is scaled to zero, under confidence-interval based algorithms. Specifically, we analyze the asymptotic sample complexity and termination weight fractions for the well-known LUCB algorithm, and introduce a new variant, the LUCB Greedy algorithm. We demonstrate that the upper bounds on the sample complexities for both algorithms are asymptotically within a constant factor of the established lower bounds.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning and Mathematics & Optimization

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Kushal Kejriwal , Nikhil Karamchandani , Jayakrishnan Nair

Topics

Machine Learning > Optimization & Theory > Optimization Machine Learning > Optimization & Theory > Theory Mathematics & Optimization > Optimization > Stochastic Methods Mathematics & Optimization > Optimization > Online Algorithms Machine Learning > Learning Types > Multi-Armed Bandits Artificial Intelligence > Core AI > Game Theory Machine Learning > Optimization & Theory > Sample Complexity

Keywords

sample complexity asymptotic analysis asymptotic optimality multi-armed bandit confidence interval optimal arm identification fixed confidence

Download PDF

Related papers

BEV-TSR: Text-Scene Retrieval in BEV Space for Autonomous Driving 2025

APIRL: Deep Reinforcement Learning for REST API Fuzzing 2025

Anywhere: A Multi-Agent Framework for User-Guided, Reliable, and Diverse Foreground-Conditioned Image Generation 2025

3CAD: A Large-Scale Real-World 3C Product Dataset for Unsupervised Anomaly Detection 2025

Collaborative Learning for 3D Hand-Object Reconstruction and Compositional Action Recognition from Egocentric RGB Videos Using Superquadrics 2025