Information Directed Sampling for Linear Partial Monitoring

Johannes Kirschner; Tor Lattimore; Andreas Krause

COLT 2020


Abstract

Partial monitoring is a rich framework for sequential decision making under uncertainty that generalizes many well-known bandit models, including linear, combinatorial and dueling bandits. We introduce information directed sampling (IDS) for stochastic partial monitoring with a linear reward and observation structure. IDS achieves adaptive worst-case regret rates that depend on precise observability conditions of the game. Moreover, we prove lower bounds that classify the minimax regret of all finite games into four possible regimes. IDS achieves the optimal rate in all cases up to logarithmic factors, without tuning any hyper-parameters. We further extend our results to the contextual and the kernelized setting, which significantly increases the range of possible applications.
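The abstract does not spell out the sampling rule. As a rough illustration only, the sketch below follows the generic information-ratio principle from the IDS literature: choose a distribution over actions that minimizes the squared expected regret estimate divided by the expected information gain. The function name `ids_distribution` and the inputs `gaps` and `infos` are hypothetical placeholders, not the paper's estimators, and the search is restricted to mixtures of at most two actions on a grid of mixing weights.

```python
import itertools

import numpy as np


def ids_distribution(gaps, infos, grid=1001):
    """Grid search for a sampling distribution with a small information ratio.

    gaps  -- hypothetical per-action regret (gap) estimates
    infos -- hypothetical per-action information gains, all >= 0

    Minimizes (expected gap)^2 / (expected information gain) over mixtures
    of at most two actions. Returns (ratio, distribution); the distribution
    is None if no candidate has positive information gain.
    """
    gaps = np.asarray(gaps, dtype=float)
    infos = np.asarray(infos, dtype=float)
    k = len(gaps)
    weights = np.linspace(0.0, 1.0, grid)  # weight placed on the first action
    best_ratio, best_dist = np.inf, None
    for i, j in itertools.combinations_with_replacement(range(k), 2):
        gap = weights * gaps[i] + (1.0 - weights) * gaps[j]
        info = weights * infos[i] + (1.0 - weights) * infos[j]
        # Mixtures with zero information gain get an infinite ratio.
        with np.errstate(divide="ignore", invalid="ignore"):
            ratio = np.where(info > 0, gap ** 2 / info, np.inf)
        idx = int(np.argmin(ratio))
        if ratio[idx] < best_ratio:
            dist = np.zeros(k)
            dist[i] += weights[idx]
            dist[j] += 1.0 - weights[idx]
            best_ratio, best_dist = float(ratio[idx]), dist
    return best_ratio, best_dist


# Action 0 looks nearly optimal but is uninformative; action 1 is costly
# but informative. The minimizer mixes the two rather than playing either.
ratio, dist = ids_distribution(gaps=[0.1, 1.0], infos=[0.0, 1.0])
print(ratio, dist)
```

In this toy instance the information ratio of a mixture that puts weight q on the informative action is (0.1 + 0.9q)^2 / q, which is minimized at q = 1/9 with value 0.36. The grid search should land close to that point.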



Topics

Artificial Intelligence > Core AI > Multi-Agent Systems

Keywords

contextual bandit, partial monitoring, linear bandit, kernelized bandit, adversarial regret, information directed sampling

