On Multilabel Classification and Ranking with Bandit Feedback

Claudio Gentile; Francesco Orabona

JMLR 2014

Abstract

We present a novel multilabel/ranking algorithm working in partial information settings. The algorithm is based on second-order descent methods, and relies on upper-confidence bounds to trade off exploration and exploitation. We analyze this algorithm in a partially adversarial setting, where covariates can be adversarial, but multilabel probabilities are ruled by (generalized) linear models. We show $O(T^{1/2}\log T)$ regret bounds, which improve in several ways on the existing results. We test the effectiveness of our upper-confidence scheme by contrasting against full-information baselines on diverse real-world multilabel data sets, often obtaining comparable performance.
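To make the idea of combining a second-order update with an upper-confidence bound concrete, here is a minimal generic sketch. It is not the algorithm analyzed in the paper: it uses a plain ridge-regression (recursive least squares) update with a linear model, and every name in it (`make_state`, `ucb_score`, `update`, `select_labels`, `alpha`, `threshold`) is a hypothetical choice made for illustration. One state is kept per label, and only labels that were actually selected would receive feedback.

```python
import math


def make_state(d):
    """Per-label state: A^{-1} = I (so A = I) and a zero weight vector."""
    a_inv = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    return {"a_inv": a_inv, "w": [0.0] * d}


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(m, v):
    return [dot(row, v) for row in m]


def ucb_score(state, x, alpha=1.0):
    """Estimated score w.x plus a confidence width alpha * sqrt(x^T A^{-1} x)."""
    width = math.sqrt(max(dot(x, matvec(state["a_inv"], x)), 0.0))
    return dot(state["w"], x) + alpha * width


def update(state, x, y):
    """Recursive least squares step on observed feedback y (assumed update rule).

    Sherman-Morrison refreshes A^{-1} after A <- A + x x^T, then
    w <- w + A^{-1} x (y - w.x).
    """
    d = len(x)
    a_inv_x = matvec(state["a_inv"], x)
    denom = 1.0 + dot(x, a_inv_x)
    state["a_inv"] = [
        [state["a_inv"][i][j] - a_inv_x[i] * a_inv_x[j] / denom for j in range(d)]
        for i in range(d)
    ]
    err = y - dot(state["w"], x)
    gain = matvec(state["a_inv"], x)
    state["w"] = [wi + err * gi for wi, gi in zip(state["w"], gain)]


def select_labels(states, x, threshold=0.5, alpha=1.0):
    """Output every label whose optimistic score exceeds the threshold."""
    return [k for k, s in enumerate(states) if ucb_score(s, x, alpha) > threshold]
```

In this sketch, the confidence width shrinks along directions of the covariate space that have been observed often, so rarely explored labels keep an optimistic score and continue to be selected occasionally.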
