Learning Fair Division from Bandit Feedback

Hakuei Yamada; Junpei Komiyama; Kenshi Abe; Atsushi Iwasaki

AISTATS 2024

Abstract

This work addresses learning online fair division under uncertainty, where a central planner sequentially allocates items without precise knowledge of agents’ values or utilities. Departing from conventional online algorithms, the planner here relies on noisy, estimated values obtained after allocating items. We introduce wrapper algorithms utilizing dual averaging, enabling gradual learning of both the type distribution of arriving items and agents’ values through bandit feedback. This approach enables the algorithms to asymptotically achieve optimal Nash social welfare in linear Fisher markets with agents having additive utilities. We also empirically verify the performance of the proposed algorithms across synthetic and empirical datasets.
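To make the objective concrete, the following is a minimal sketch of how Nash social welfare could be computed for agents with additive utilities. It is not the authors' algorithm. The function names, the equal weighting of agents, and the fractional-allocation matrix format are assumptions made for illustration only.

```python
import math


def additive_utilities(values, allocation):
    """Return each agent's additive utility.

    values[i][j]     -- agent i's value for item j (assumed known here;
                        in the paper's setting these are learned from noisy feedback)
    allocation[i][j] -- fraction of item j given to agent i (hypothetical format)
    """
    return [
        sum(v * x for v, x in zip(value_row, alloc_row))
        for value_row, alloc_row in zip(values, allocation)
    ]


def nash_social_welfare(values, allocation):
    """Geometric mean of the agents' utilities (equal weights assumed)."""
    utils = additive_utilities(values, allocation)
    # If any agent receives zero utility, the product of utilities is zero.
    if any(u <= 0 for u in utils):
        return 0.0
    # Average the logs instead of multiplying, for numerical stability.
    return math.exp(sum(math.log(u) for u in utils) / len(utils))


# Two agents, two items: agent 0 gets item 1, agent 1 gets item 0.
values = [[1.0, 2.0], [3.0, 1.0]]
allocation = [[0.0, 1.0], [1.0, 0.0]]
nsw = nash_social_welfare(values, allocation)
```

In this toy instance the utilities are 2 and 3, so the Nash social welfare is the square root of 6. Swapping the two items would give each agent utility 1 and a lower welfare of 1, which is why the first allocation is preferred under this objective.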

Topics

Machine Learning > Learning Types > Active Learning
Machine Learning > Learning Types > Multi-Armed Bandits
Mathematics & Optimization > Optimization > Stochastic Methods
Mathematics & Optimization > Optimization > Online Algorithms
Mathematics & Optimization > Optimization > Game Theory

Keywords

dual averaging; bandit feedback; multi-armed bandit; Nash social welfare; fair division; online fair division; linear Fisher market
