Stochastic Dueling Bandits with Adversarial Corruption

Arpit Agarwal; Shivani Agarwal; Prathamesh Patil

ALT 2021

Abstract

The dueling bandits problem has received a lot of attention in recent years due to its applications in recommendation systems and information retrieval. However, due to the prevalence of malicious users in these systems, it is becoming increasingly important to design dueling bandit algorithms that are robust to corruptions introduced by these malicious users. In this paper we study dueling bandits in the presence of an adversary that can corrupt some of the feedback received by the learner. We propose an algorithm for this problem that is agnostic to the amount of corruption introduced by the adversary: its regret degrades gracefully with the amount of corruption, and in case of no corruption, it essentially matches the optimal regret bounds achievable in the purely stochastic dueling bandits setting.
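
The setting described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's algorithm or its exact corruption model: the preference matrix `pref`, the class name `CorruptedDuelEnv`, the adversary that simply flips the first `budget` outcomes, and the helper `pair_regret` are all assumptions made for illustration. The regret formula is the average-gap definition commonly used in the dueling bandits literature.

```python
import random


class CorruptedDuelEnv:
    """Toy dueling-bandit environment with a budgeted adversary (hypothetical)."""

    def __init__(self, pref, budget, seed=0):
        # pref[i][j] is the probability that arm i beats arm j.
        self.pref = pref
        # Assumed corruption model: the adversary may flip at most `budget` outcomes.
        self.budget = budget
        self.used = 0
        self.rng = random.Random(seed)

    def compare(self, i, j):
        """Return True if the learner observes arm i beating arm j."""
        outcome = self.rng.random() < self.pref[i][j]
        # Illustrative adversary: flip every outcome until the budget runs out.
        if self.used < self.budget:
            self.used += 1
            return not outcome
        return outcome


def pair_regret(pref, best, i, j):
    """Average-gap regret of dueling arms i and j against the arm `best`."""
    return ((pref[best][i] - 0.5) + (pref[best][j] - 0.5)) / 2


if __name__ == "__main__":
    pref = [[0.5, 0.7], [0.3, 0.5]]
    env = CorruptedDuelEnv(pref, budget=5, seed=1)
    wins = sum(env.compare(0, 1) for _ in range(1000))
    print("observed wins for arm 0:", wins)
    print("regret of dueling (1, 1):", pair_regret(pref, 0, 1, 1))
```

A learner that is agnostic to the corruption level, as the abstract describes, would interact only through `compare` and would not be told `budget`.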

Topics

Machine Learning > Application Areas > Risk Management
Machine Learning > Optimization & Theory > Online Algorithms

Keywords

adversarial corruption, bandit algorithm, recommendation system, dueling bandit
