Pareto Front Identification from Stochastic Bandit Feedback

Peter Auer; Chao-Kai Chiang; Ronald Ortner; Madalina Drugan

AISTATS 2016

Abstract

We consider the problem of identifying the Pareto front for multiple objectives from a finite set of operating points. Sampling an operating point gives a random vector in which each coordinate corresponds to the value of one of the objectives. The Pareto front is the set of operating points that are not dominated by any other operating point with respect to all objectives (considering the means of their objective values). We propose a confidence bound algorithm to approximate the Pareto front, and prove problem-specific lower and upper bounds, showing that the sample complexity is characterized by some natural geometric properties of the operating points. Experiments confirm the reliability of our algorithm. For the problem of finding a sparse cover of the Pareto front, we propose an asymmetric covering algorithm of independent interest.
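The dominance relation in the abstract can be illustrated with a minimal sketch. This is not the paper's confidence bound algorithm: it only computes the Pareto front of a set of mean vectors that are assumed to be known (or already estimated by averaging samples). The function names `dominates`, `pareto_front`, and `empirical_means` are hypothetical, and the sketch assumes that larger objective values are better and that a point dominates another when it is at least as good in every objective and strictly better in at least one.

```python
import numpy as np


def dominates(p, q):
    """Return True if p dominates q (assumption: higher is better).

    p dominates q when p is at least as good in every objective
    and strictly better in at least one.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return bool(np.all(p >= q) and np.any(p > q))


def pareto_front(means):
    """Return indices of the points not dominated by any other point."""
    means = np.asarray(means, dtype=float)
    n = len(means)
    return [
        i for i in range(n)
        if not any(dominates(means[j], means[i]) for j in range(n) if j != i)
    ]


def empirical_means(samples):
    """Average the sampled reward vectors of each operating point.

    samples: one array per operating point, each of shape
    (num_samples, num_objectives).
    """
    return np.array([np.mean(np.asarray(s, dtype=float), axis=0) for s in samples])


if __name__ == "__main__":
    # Five hypothetical operating points with two objectives each.
    means = [[1.0, 0.2], [0.6, 0.6], [0.2, 1.0], [0.5, 0.5], [0.1, 0.1]]
    print(pareto_front(means))
```

In the bandit setting the means are unknown, so a front computed from empirical averages can be wrong when few samples have been drawn; the confidence bounds proposed in the paper address that uncertainty, which this sketch does not model.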

Authors

Peter Auer, Chao-Kai Chiang, Ronald Ortner, Madalina Drugan

Topics

Machine Learning > Optimization & Theory > Theory
Mathematics & Optimization > Optimization > Online Algorithms

Keywords

sample complexity, multi-objective optimization, stochastic bandit, confidence bound, Pareto optimization
