Fair Algorithms for Multi-Agent Multi-Armed Bandits

Safwan Hossain; Evi Micha; Nisarg Shah

2021 NIPS NeurIPS 2021

Fair Algorithms for Multi-Agent Multi-Armed Bandits

Abstract

We propose a multi-agent variant of the classical multi-armed bandit problem, in which there are $N$ agents and $K$ arms, and pulling an arm generates a (possibly different) stochastic reward for each agent. Unlike the classical multi-armed bandit problem, the goal is not to learn the "best arm"; indeed, each agent may perceive a different arm to be the best for her personally. Instead, we seek to learn a fair distribution over the arms. Drawing on a long line of research in economics and computer science, we use the Nash social welfare as our notion of fairness. We design multi-agent variants of three classic multi-armed bandit algorithms and show that they achieve sublinear regret, which is now measured in terms of the lost Nash social welfare. We also extend a classical lower bound, establishing the optimality of one of our algorithms.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning and Mathematics & Optimization

🧭 Keyword Pioneer — fair algorithm

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Safwan Hossain , Evi Micha , Nisarg Shah

Topics

Artificial Intelligence > Core AI > Multi-Agent Systems Machine Learning > Application Areas > Fairness Mathematics & Optimization > Optimization > Online Algorithms Machine Learning > Learning Types > Multi-Agent Systems Machine Learning > Learning Types > Fairness

Keywords

multi-armed bandit regret bound nash social welfare multi-agent system fair algorithm

Download PDF

Related papers

Mosaicking to Distill: Knowledge Distillation from Out-of-Domain Data 2021

On Model Calibration for Long-Tailed Object Detection and Instance Segmentation 2021

Test-Time Personalization with a Transformer for Human Pose Estimation 2021

NTopo: Mesh-free Topology Optimization using Implicit Neural Representations 2021

Scalable Intervention Target Estimation in Linear Models 2021