Multi-objective Bandits: Optimizing the Generalized Gini Index

Róbert Busa-Fekete; Balázs Szörényi; Paul Weng; Shie Mannor

ICML 2017

Abstract

We study the multi-armed bandit (MAB) problem in which the agent receives vector-valued feedback that encodes many possibly competing objectives to be optimized. The goal of the agent is to find a policy that optimizes these objectives simultaneously in a fair way. This multi-objective online optimization problem is formalized using the Generalized Gini Index (GGI) aggregation function. We propose an online gradient descent algorithm that exploits the convexity of the GGI aggregation function and carefully controls exploration, achieving a distribution-free regret of $\tilde{O}(T^{-1/2})$ with high probability. We test our algorithm on synthetic data as well as on an electric battery control problem, where the goal is to trade off the use of the different cells of a battery in order to balance their respective degradation rates.
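The abstract names the GGI aggregation function without defining it. As a rough illustration only, the sketch below assumes the definition commonly used in fair multi-objective optimization: the components of a cost vector are sorted in decreasing order and combined with non-increasing weights, so the worst objective counts most. The function names `ggi` and `ggi_subgradient`, the example weights, and the cost-minimization convention are all assumptions made for this sketch, not details taken from the paper.

```python
import numpy as np


def ggi(costs, weights):
    """Generalized Gini Index of a cost vector (assumed definition).

    Sorts the costs in decreasing order and takes a weighted sum with
    non-increasing weights, so larger costs receive larger weights.
    """
    costs = np.asarray(costs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if costs.shape != weights.shape:
        raise ValueError("costs and weights must have the same shape")
    if np.any(np.diff(weights) > 0):
        raise ValueError("weights must be non-increasing")
    sorted_costs = np.sort(costs)[::-1]  # largest cost first
    return float(weights @ sorted_costs)


def ggi_subgradient(costs, weights):
    """One subgradient of the GGI at `costs` (assumed definition).

    The largest cost gets the largest weight, the second largest the
    second weight, and so on. A gradient-descent style method could use
    this vector as its descent direction.
    """
    costs = np.asarray(costs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(-costs, kind="stable")  # indices, largest cost first
    grad = np.empty_like(costs)
    grad[order] = weights
    return grad


# Hypothetical weights: each objective counts half as much as the previous one.
example_weights = np.array([1.0, 0.5, 0.25])
balanced = np.array([2.0, 2.0, 2.0])    # total cost 6, evenly spread
unbalanced = np.array([4.0, 1.0, 1.0])  # total cost 6, concentrated

print(ggi(balanced, example_weights))
print(ggi(unbalanced, example_weights))
```

Both example vectors have the same total cost, but under this definition the unbalanced one receives the higher (worse) GGI value, which is the sense in which minimizing the GGI favors fair solutions. The result also does not depend on the order of the objectives.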

Topics

Machine Learning > Optimization & Theory > Optimization
Mathematics & Optimization > Optimization > Online Algorithms

Keywords

multi-objective optimization; multi-armed bandit; regret bound; online gradient descent; generalized Gini index
