Efficient Online Bandit Multiclass Learning with $\tilde{O}(\sqrt{T})$ Regret

Alina Beygelzimer; Francesco Orabona; Chicheng Zhang

ICML 2017

Abstract

We present an efficient second-order algorithm with $\tilde{O}(\frac{1}{\eta}\sqrt{T})$ regret for the bandit online multiclass problem. The regret bound holds simultaneously with respect to a family of loss functions parameterized by $\eta$, ranging from hinge loss ($\eta=0$) to squared hinge loss ($\eta=1$). This provides a solution to the open problem of (Abernethy, J. and Rakhlin, A. An efficient bandit algorithm for $\sqrt{T}$-regret in online multiclass prediction? In COLT, 2009). We test our algorithm experimentally, showing that it performs favorably against earlier algorithms.
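
To make the loss family concrete, the sketch below implements one interpolation between the two endpoints named in the abstract. The functional form is an assumption: it was chosen only because it reduces to hinge loss at $\eta=0$ and to squared hinge loss at $\eta=1$, and the paper's own definition should be consulted for the exact family. The names `loss_eta`, `hinge`, and `squared_hinge` are hypothetical helpers, and `z` stands for a margin value.

```python
def hinge(z: float) -> float:
    """Standard hinge loss on a margin z."""
    return max(1.0 - z, 0.0)


def squared_hinge(z: float) -> float:
    """Standard squared hinge loss on a margin z."""
    return max(1.0 - z, 0.0) ** 2


def loss_eta(z: float, eta: float) -> float:
    """Assumed interpolating loss (not confirmed by the abstract).

    For z <= 1 it is the quadratic
        1 - (2 / (2 - eta)) * z + (eta / (2 - eta)) * z**2,
    and it is 0 for z > 1. This gives hinge loss at eta = 0 and
    squared hinge loss at eta = 1.
    """
    if not 0.0 <= eta <= 1.0:
        raise ValueError("eta must lie in [0, 1]")
    if z > 1.0:
        return 0.0
    return 1.0 - (2.0 / (2.0 - eta)) * z + (eta / (2.0 - eta)) * z * z


if __name__ == "__main__":
    # Compare the assumed family with its two endpoints on a few margins.
    for z in (-1.0, 0.0, 0.5, 1.0, 2.0):
        print(z, hinge(z), loss_eta(z, 0.0), loss_eta(z, 0.5),
              loss_eta(z, 1.0), squared_hinge(z))
```

Under this assumed form the loss is zero at margin $z=1$ for every $\eta$, so the pieces join continuously; intermediate values of $\eta$ give a quadratic that lies between the two endpoint losses only in the sense of matching them at the extremes of the parameter range.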

Topics

Machine Learning > Core Methods > Classification

Machine Learning > Optimization & Theory > Online Algorithms

Keywords

stochastic optimization, online learning, multiclass classification, regret bound, bandit algorithm
