An Analysis of Ensemble Sampling

Chao Qin; Zheng Wen; Xiuyuan Lu; Benjamin Van Roy

2022 NIPS NeurIPS 2022

An Analysis of Ensemble Sampling

Abstract

Ensemble sampling serves as a practical approximation to Thompson sampling when maintaining an exact posterior distribution over model parameters is computationally intractable. In this paper, we establish a regret bound that ensures desirable behavior when ensemble sampling is applied to the linear bandit problem. This represents the first rigorous regret analysis of ensemble sampling and is made possible by leveraging information-theoretic concepts and novel analytic techniques that may prove useful beyond the scope of this paper.

🌉 Interdisciplinary Bridge — Machine Learning and Mathematics & Optimization

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Chao Qin , Zheng Wen , Xiuyuan Lu , Benjamin Van Roy

Topics

Machine Learning > Optimization & Theory > Learning Theory Mathematics & Optimization > Optimization > Online Algorithms Machine Learning > Learning Types > Multi-Armed Bandits Machine Learning > Learning Types > Exploration Machine Learning > Learning Types > Sampling

Keywords

information theory online learning thompson sampling regret bound linear bandit ensemble sampling

Download PDF

Related papers

Transferring Pre-trained Multimodal Representations with Cross-modal Similarity Matching 2022

A Theoretical View on Sparsely Activated Networks 2022

Prune and distill: similar reformatting of image information along rat visual cortex and deep neural networks 2022

Matryoshka Representation Learning 2022

Off-Policy Evaluation with Deficient Support Using Side Information 2022