An explore-then-commit algorithm for submodular maximization under full-bandit feedback

Guanyu Nie; Mridul Agarwal; Abhishek Kumar Umrawal; Vaneet Aggarwal; Christopher John Quinn

UAI 2022

Abstract

We investigate the problem of combinatorial multi-armed bandits with stochastic submodular (in expectation) rewards and full-bandit feedback, where the only information observed at each time step $t$ is the reward of the selected action. We propose a simple algorithm, Explore-Then-Commit Greedy (ETCG), and prove that it achieves a $(1-1/e)$-regret upper bound of $\mathcal{O}(n^\frac{1}{3}k^\frac{4}{3}T^\frac{2}{3}\log(T)^\frac{1}{2})$ for a horizon $T$, number of base elements $n$, and cardinality constraint $k$. We also show in experiments with synthetic and real-world data that ETCG empirically outperforms other full-bandit methods.
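The abstract names the algorithm but does not spell out its steps, so the following is only a minimal sketch of a generic explore-then-commit greedy scheme under full-bandit feedback, not the authors' exact procedure. It assumes that exploration runs in $k$ greedy phases, that each candidate set is played a fixed number of times `m`, and that the learner then commits to the greedily built set for the remaining rounds. The names `etc_greedy`, `make_noisy_coverage`, `play`, and `m` are hypothetical, and the choice of `m` that yields the stated regret bound is not given in the abstract, so it is left as a parameter.

```python
import random
from typing import Callable, FrozenSet, List, Sequence, Set, Tuple


def etc_greedy(
    n: int,
    k: int,
    horizon: int,
    m: int,
    play: Callable[[FrozenSet[int]], float],
) -> Tuple[FrozenSet[int], List[float]]:
    """Sketch of explore-then-commit greedy (assumed structure, see lead-in).

    n: number of base elements, k: cardinality constraint,
    horizon: total rounds T, m: plays per candidate set (hypothetical parameter),
    play: returns one noisy reward for the chosen set (full-bandit feedback).
    """
    if not (1 <= k <= n) or m < 1:
        raise ValueError("need 1 <= k <= n and m >= 1")
    # Phase i evaluates n - i candidate sets, each played m times.
    explore_rounds = m * sum(n - i for i in range(k))
    if explore_rounds > horizon:
        raise ValueError("exploration budget exceeds the horizon")

    chosen: Set[int] = set()
    rewards: List[float] = []

    # Exploration: grow the set by one element per phase.
    for _ in range(k):
        best_elem, best_mean = -1, float("-inf")
        for a in range(n):
            if a in chosen:
                continue
            action = frozenset(chosen | {a})
            samples = [play(action) for _ in range(m)]
            rewards.extend(samples)
            mean = sum(samples) / m
            if mean > best_mean:  # ties keep the lowest-index element
                best_elem, best_mean = a, mean
        chosen.add(best_elem)

    # Commitment: play the selected set for all remaining rounds.
    committed = frozenset(chosen)
    while len(rewards) < horizon:
        rewards.append(play(committed))
    return committed, rewards


def make_noisy_coverage(
    covers: Sequence[Set[int]], noise: float, seed: int = 0
) -> Callable[[FrozenSet[int]], float]:
    """Toy environment: normalized set coverage plus Gaussian noise.

    Coverage is submodular, so the expected reward is submodular too.
    """
    rng = random.Random(seed)
    universe = set().union(*covers)

    def play(action: FrozenSet[int]) -> float:
        covered = set().union(*(covers[a] for a in action))
        return len(covered) / len(universe) + rng.gauss(0.0, noise)

    return play
```

As a usage example, one could build an environment with `play = make_noisy_coverage([{0, 1, 2, 3}, {0, 1}, {4, 5}, {2, 3}], noise=0.1)` and call `etc_greedy(n=4, k=2, horizon=2000, m=50, play=play)`. Larger `m` makes the greedy choices more reliable but spends more rounds on suboptimal sets, which is the trade-off the regret bound balances.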

Authors

Guanyu Nie, Mridul Agarwal, Abhishek Kumar Umrawal, Vaneet Aggarwal, Christopher John Quinn

Topics

Artificial Intelligence > Core AI > Multi-Agent Systems; Mathematics & Optimization > Optimization > Combinatorial Optimization; Mathematics & Optimization > Optimization > Stochastic Methods

Keywords

submodular maximization; multi-armed bandit; regret bound; full-bandit feedback
