2011
NIPS
NeurIPS 2011
Stochastic convex optimization with bandit feedback
Abstract
This paper addresses the problem of minimizing a convex, Lipschitz function $f$ over a convex, compact set $X$ under a stochastic bandit feedback model. In this model, the algorithm is allowed to observe noisy realizations of the function value $f(x)$ at any query point $x \in X$. We demonstrate a generalization of the ellipsoid algorithm that incurs $O(\poly(d)\sqrt{T})$ regret. Since any algorithm has regret at least $\Omega(\sqrt{T})$ on this problem, our algorithm is optimal in terms of the scaling with $T$.
🌉
Interdisciplinary Bridge
— Machine Learning and Mathematics & Optimization
🧭
Keyword Pioneer
— ellipsoid algorithm
🐣
Hot Topic Early Bird
— regret bound
🐝
Cross-Pollinator
— Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio
Authors
Topics
Machine Learning > Optimization & Theory > Learning Theory
Machine Learning > Optimization & Theory > Optimization
Mathematics & Optimization > Optimization > Stochastic Methods
Mathematics & Optimization > Optimization > Online Algorithms
Machine Learning > Learning Types > Online Learning
Machine Learning > Optimization & Theory > Stochastic Methods
Mathematics & Optimization > Optimization > Convex Optimization
Machine Learning > Learning Types > Stochastic Methods