Towards Scalable and Robust Structured Bandits: A Meta-Learning Framework

Runzhe Wan; Lin Ge; Rui Song

AISTATS 2023

Abstract

Online learning in large-scale structured bandits is known to be challenging due to the curse of dimensionality. In this paper, we propose a unified meta-learning framework for a wide class of structured bandit problems in which the parameter space can be factorized at the item level, a class that covers many popular tasks. Compared with existing approaches, the proposed solution is scalable to large systems and, because it uses a more flexible model, robust. At the core of this framework is a Bayesian hierarchical model that allows information sharing among items via their features, upon which we design a meta Thompson sampling algorithm. Three representative examples are discussed thoroughly. Theoretical analysis and extensive numerical results both support the usefulness of the proposed method.
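To make the idea of sharing information among items via their features concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a much simpler setting than the paper covers: independent arms with Gaussian rewards, where each item's mean reward is drawn around a linear function of its features. The model, the prior on the coefficient vector `gamma`, the noise levels `sigma` and `sigma1`, and the function names `sample_item_means` and `run_meta_ts` are all assumptions made for illustration.

```python
import numpy as np

# Assumed toy model (not taken from the paper):
#   gamma ~ N(0, I)                        shared coefficients
#   theta_i | gamma ~ N(x_i^T gamma, sigma1^2)   item-level mean reward
#   reward | item i ~ N(theta_i, sigma^2)

def sample_item_means(X, counts, sums, sigma=1.0, sigma1=0.5, rng=None):
    """Draw one joint posterior sample of all item means."""
    rng = np.random.default_rng() if rng is None else rng
    d = X.shape[1]
    seen = counts > 0
    # With theta_i integrated out, the sample mean of a pulled item is
    # N(x_i^T gamma, sigma1^2 + sigma^2 / n_i), a Bayesian linear regression.
    var = sigma1**2 + sigma**2 / counts[seen]
    ybar = sums[seen] / counts[seen]
    Xs = X[seen]
    precision = np.eye(d) + (Xs / var[:, None]).T @ Xs
    cov = np.linalg.inv(precision)
    cov = (cov + cov.T) / 2.0  # guard against tiny asymmetry from inversion
    mean = cov @ (Xs.T @ (ybar / var))
    gamma = rng.multivariate_normal(mean, cov)
    # Given gamma, each item mean has a conjugate Gaussian posterior that
    # blends the feature-based prior mean with the item's own rewards.
    post_prec = 1.0 / sigma1**2 + counts / sigma**2
    post_mean = (X @ gamma / sigma1**2 + sums / sigma**2) / post_prec
    return rng.normal(post_mean, np.sqrt(1.0 / post_prec))

def run_meta_ts(X, true_means, horizon=500, sigma=1.0, sigma1=0.5, seed=0):
    """Run Thompson sampling on simulated rewards; return pull counts."""
    rng = np.random.default_rng(seed)
    n_items = X.shape[0]
    counts = np.zeros(n_items)
    sums = np.zeros(n_items)
    for _ in range(horizon):
        theta = sample_item_means(X, counts, sums, sigma, sigma1, rng)
        arm = int(np.argmax(theta))  # act greedily on the posterior sample
        reward = rng.normal(true_means[arm], sigma)
        counts[arm] += 1
        sums[arm] += reward
    return counts
```

In this sketch an item that has never been pulled still receives an informed posterior, because its features and the sampled `gamma` determine its prior mean. That is the sense in which observations on one item inform the others.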

Topics

Artificial Intelligence > Learning Paradigms > Transfer Learning
Artificial Intelligence > Learning Paradigms > Meta-Learning
Machine Learning > Optimization & Theory > Bayesian Inference
Machine Learning > Learning Paradigms > Meta-Learning
Artificial Intelligence > Bayesian & Probabilistic > Bayesian Inference
Machine Learning > Learning Types > Multi-Armed Bandits

Keywords

online learning, Thompson sampling, Bayesian hierarchical model, structured bandit
