Pure Exploration and Regret Minimization in Matching Bandits

Flore Sentenac; Jialin Yi; Clément Calauzènes; Vianney Perchet; Milan Vojnovic

ICML 2021

Abstract

Finding an optimal matching in a weighted graph is a standard combinatorial problem. We consider its semi-bandit version, where either a pair or a full matching is sampled sequentially. We prove that it is possible to leverage a rank-1 assumption on the adjacency matrix to reduce the sample complexity and the regret of off-the-shelf algorithms, to the point of reaching a linear dependency in the number of vertices (up to poly-log terms).
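To make the rank-1 setting concrete, the following is a minimal sketch, not the authors' algorithm. It assumes the rank-1 adjacency matrix takes the form w[i][j] = u[i] * u[j] for a hidden vector u with entries in [0, 1], and that sampling a pair returns a Bernoulli reward with that mean; both are illustrative assumptions, as are all function names. Under this assumption, pairing vertices in sorted order of u (best with second best, and so on) attains the maximum matching weight, which the sketch checks against brute-force enumeration on a small graph.

```python
import random


def perfect_matchings(vertices):
    """Yield every perfect matching of an even-sized list of vertices."""
    if not vertices:
        yield []
        return
    first, rest = vertices[0], vertices[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in perfect_matchings(remaining):
            yield [(first, partner)] + tail


def matching_weight(matching, u):
    """Expected reward of a matching when w[i][j] = u[i] * u[j] (assumed form)."""
    return sum(u[i] * u[j] for i, j in matching)


def brute_force_best(u):
    """Enumerate all perfect matchings and keep the heaviest one."""
    vertices = list(range(len(u)))
    return max(perfect_matchings(vertices), key=lambda m: matching_weight(m, u))


def sorted_pairing(u):
    """Pair the best vertex with the second best, the third with the fourth, etc."""
    order = sorted(range(len(u)), key=lambda i: u[i], reverse=True)
    return [(order[k], order[k + 1]) for k in range(0, len(order), 2)]


def sample_pair(i, j, u, rng):
    """Hypothetical semi-bandit feedback: Bernoulli reward with mean u[i] * u[j]."""
    return 1 if rng.random() < u[i] * u[j] else 0


rng = random.Random(0)
u = [rng.random() for _ in range(6)]  # hidden vertex parameters (illustrative)
best = brute_force_best(u)
paired = sorted_pairing(u)
```

Because the optimal matching in this sketch depends only on the ordering of the n entries of u, rather than on all n(n-1)/2 edge weights separately, it gives some intuition for why a dependency that is linear in the number of vertices is plausible.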

Authors

Flore Sentenac, Jialin Yi, Clément Calauzènes, Vianney Perchet, Milan Vojnovic

Topics

Mathematics & Optimization > Optimization > Online Algorithms

Keywords

combinatorial optimization, graph matching, regret minimization, semi-bandit feedback, matching bandit
