Bandit Smooth Convex Optimization: Improving the Bias-Variance Tradeoff

Ofer Dekel; Ronen Eldan; Tomer Koren

2015 NIPS NeurIPS 2015

Bandit Smooth Convex Optimization: Improving the Bias-Variance Tradeoff

Abstract

Bandit convex optimization is one of the fundamental problems in the field of online learning. The best algorithm for the general bandit convex optimization problem guarantees a regret of $\widetilde{O}(T^{5/6})$, while the best known lower bound is $\Omega(T^{1/2})$. Many attemptshave been made to bridge the huge gap between these bounds. A particularly interesting special case of this problem assumes that the loss functions are smooth. In this case, the best known algorithm guarantees a regret of $\widetilde{O}(T^{2/3})$. We present an efficient algorithm for the banditsmooth convex optimization problem that guarantees a regret of $\widetilde{O}(T^{5/8})$. Our result rules out an $\Omega(T^{2/3})$ lower bound and takes a significant step towards the resolution of this open problem.

🌉 Interdisciplinary Bridge — Machine Learning and Mathematics & Optimization

🧭 Keyword Pioneer — smooth convex optimization

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Ofer Dekel , Ronen Eldan , Tomer Koren

Topics

Machine Learning > Optimization & Theory > Optimization Mathematics & Optimization > Optimization > Online Algorithms

Keywords

online learning regret bound bandit convex optimization smooth convex optimization

Download PDF

Related papers

Data Generation as Sequential Decision Making 2015

A Recurrent Latent Variable Model for Sequential Data 2015

Combinatorial Cascading Bandits 2015

Accelerated Mirror Descent in Continuous and Discrete Time 2015

Matrix Completion with Noisy Side Information 2015