Information Directed Sampling for Sparse Linear Bandits

Botao Hao; Tor Lattimore; Wei Deng

2021 NIPS NeurIPS 2021

Information Directed Sampling for Sparse Linear Bandits

Abstract

Stochastic sparse linear bandits offer a practical model for high-dimensional online decision-making problems and have a rich information-regret structure. In this work we explore the use of information-directed sampling (IDS), which naturally balances the information-regret trade-off. We develop a class of information-theoretic Bayesian regret bounds that nearly match existing lower bounds on a variety of problem instances, demonstrating the adaptivity of IDS. To efficiently implement sparse IDS, we propose an empirical Bayesian approach for sparse posterior sampling using a spike-and-slab Gaussian-Laplace prior. Numerical results demonstrate significant regret reductions by sparse IDS relative to several baselines.

🌉 Interdisciplinary Bridge — Machine Learning and Mathematics & Optimization

🧭 Keyword Pioneer — sparse linear bandit

🐣 Hot Topic Early Bird — posterior sampling

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy

Authors

Botao Hao , Tor Lattimore , Wei Deng

Topics

Machine Learning > Optimization & Theory > Bayesian Inference Mathematics & Optimization > Optimization > Online Algorithms Machine Learning > Learning Types > Multi-Armed Bandits Machine Learning > Bayesian & Probabilistic > Bayesian Optimization Machine Learning > Learning Types > Sparse Learning

Keywords

empirical bayesian posterior sampling spike-and-slab prior bayesian regret bayesian regret bound sparse linear bandit information directed sampling

Download PDF

Related papers

Mosaicking to Distill: Knowledge Distillation from Out-of-Domain Data 2021

On Model Calibration for Long-Tailed Object Detection and Instance Segmentation 2021

Test-Time Personalization with a Transformer for Human Pose Estimation 2021

NTopo: Mesh-free Topology Optimization using Implicit Neural Representations 2021

Scalable Intervention Target Estimation in Linear Models 2021