A Parameter-free Hedging Algorithm

Kamalika Chaudhuri; Yoav Freund; Daniel J. Hsu

NeurIPS (NIPS) 2009

Abstract

We study the problem of decision-theoretic online learning (DTOL). Motivated by practical applications, we focus on DTOL when the number of actions is very large. Previous algorithms for learning in this framework have a tunable learning rate parameter, and a major barrier to using online learning in practical applications is that it is not understood how to set this parameter optimally, particularly when the number of actions is large. In this paper, we offer a clean solution by proposing a novel and completely parameter-free algorithm for DTOL. We also introduce a new notion of regret, which is more natural for applications with a large number of actions. We show that our algorithm achieves good performance with respect to this new notion of regret. Under previous notions of regret, its performance is also close to the best bounds achieved by previous algorithms with optimally tuned parameters.
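To make the learning-rate issue concrete, here is a minimal sketch of the classical exponential-weights Hedge update that the abstract contrasts against. It is not the parameter-free algorithm proposed in the paper. The function names `hedge` and `regret`, the synthetic loss matrix, and the choice of `eta` values are all assumptions made for illustration; losses are assumed to lie in [0, 1].

```python
import math
import random


def hedge(losses, eta):
    """Run classical Hedge on a list of per-round loss vectors (values in [0, 1]).

    Returns (learner_expected_loss, best_single_action_loss).
    """
    n = len(losses[0])
    log_w = [0.0] * n          # log-weights; all zero means a uniform start
    cum = [0.0] * n            # cumulative loss of each action
    learner_loss = 0.0
    for loss_t in losses:
        # Normalize in log space for numerical stability.
        m = max(log_w)
        w = [math.exp(lw - m) for lw in log_w]
        z = sum(w)
        p = [wi / z for wi in w]
        # The learner suffers the expected loss under its distribution p.
        learner_loss += sum(pi * li for pi, li in zip(p, loss_t))
        # Multiplicative update: w_i <- w_i * exp(-eta * loss_i).
        for i, li in enumerate(loss_t):
            log_w[i] -= eta * li
            cum[i] += li
    return learner_loss, min(cum)


def regret(losses, eta):
    """Regret of Hedge with learning rate eta against the best single action."""
    learner, best = hedge(losses, eta)
    return learner - best


if __name__ == "__main__":
    rng = random.Random(0)
    T, N = 500, 50  # hypothetical horizon and number of actions
    losses = [[rng.random() for _ in range(N)] for _ in range(T)]
    for eta in (0.001, 0.1, 10.0):
        print(f"eta={eta:<6} regret={regret(losses, eta):.2f}")
```

Running the script prints the regret for a few learning rates on the same loss sequence, which shows how sensitive the classical update can be to `eta`. The standard tuning for this update depends on both the horizon and the number of actions, which is the dependence a parameter-free method aims to remove.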

Topics

Machine Learning > Optimization & Theory > Optimization
Mathematics & Optimization > Optimization > Online Algorithms
Machine Learning > Learning Types > Online Learning
Machine Learning > Optimization & Theory > Online Algorithms

Keywords

online learning, regret minimization, parameter-free algorithm, decision-theoretic, hedging algorithm, decision-theoretic online learning, multi-armed bandit, regret bound
