2017
ALT
ALT 2017
New bounds on the price of bandit feedback for mistake-bounded online multiclass learning
Abstract
This paper is about two generalizations of the mistake bound model to online multiclass classification. In the standard model, the learner receives the correct classification at the end of each round, and in the bandit model, the learner only finds out whether its prediction was correct or not. For a set $F$ of multiclass classifiers, let $\mathrm{opt}_{\mathrm{std}}(F)$ and $\mathrm{opt}_{\mathrm{bandit}}(F)$ be the optimal bounds for learning $F$ according to these two models. We show that an $$ \mathrm{opt}_{\mathrm{bandit}}(F) \leq (1 + o(1)) (|Y| \ln |Y|) \mathrm{opt}_{\mathrm{std}}(F) $$ bound is the best possible up to the leading constant, closing a $\Theta(\log |Y|)$ factor gap.
🚀
Conference Pioneer
— ALT 2017
🌉
Interdisciplinary Bridge
— Machine Learning and Mathematics & Optimization
🐣
Hot Topic Early Bird
— multi-class classification
🐝
Cross-Pollinator
— Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio