Robust Stochastic Linear Contextual Bandits Under Adversarial Attacks

QIN DING; Cho-jui Hsieh; James Sharpnack

2022 AISTATS AISTATS 2022

Robust Stochastic Linear Contextual Bandits Under Adversarial Attacks

Abstract

Stochastic linear contextual bandit algorithms have substantial applications in practice, such as recommender systems, online advertising, clinical trials, etc. Recent works show that optimal bandit algorithms are vulnerable to adversarial attacks and can fail completely in the presence of attacks. Existing robust bandit algorithms only work for the non-contextual setting under the attack of rewards and cannot improve the robustness in the general and popular contextual bandit environment. In addition, none of the existing methods can defend against attacked context. In this work, we provide the first robust bandit algorithm for stochastic linear contextual bandit setting under a fully adaptive and omniscient attack with sub-linear regret. Our algorithm not only works under the attack of rewards, but also under attacked context. Moreover, it does not need any information about the attack budget or the particular form of the attack. We provide theoretical guarantees for our proposed algorithm and show by experiments that our proposed algorithm improves the robustness against various kinds of popular attacks.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning and Mathematics & Optimization

🧭 Keyword Pioneer — reward perturbation

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Interdisciplinary, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Speech & Audio

Authors

QIN DING , Cho-jui Hsieh , James Sharpnack

Topics

Artificial Intelligence > Core AI > Agent Systems Machine Learning > Learning Types > Adversarial Learning Mathematics & Optimization > Optimization > Online Algorithms Machine Learning > Optimization & Theory > Stochastic Methods Machine Learning > Learning Types > Multi-Armed Bandits Machine Learning > Application Areas > Recommender Systems

Keywords

adversarial robustness linear regression sub-linear regret adversarial attack contextual bandit reward perturbation attack context robust algorithm

Download PDF

Related papers

Exploring Image Regions Not Well Encoded by an INN 2022

On Linear Model with Markov Signal Priors 2022

Probabilistic Numerical Method of Lines for Time-Dependent Partial Differential Equations 2022

On Distributionally Robust Optimization and Data Rebalancing 2022

Common Failure Modes of Subcluster-based Sampling in Dirichlet Process Gaussian Mixture Models - and a Deep-learning Solution 2022