The Rate of Convergence of Adaboost

Indraneel Mukherjee; Cynthia Rudin; Robert E. Schapire

COLT 2011

Abstract

The AdaBoost algorithm of Freund and Schapire (1997) was designed to combine many “weak” hypotheses that perform slightly better than a random guess into a “strong” hypothesis that has very low error. We study the rate at which AdaBoost iteratively converges to the minimum of the “exponential loss”. Our proofs do not require a weak-learning assumption, nor do they require that minimizers of the exponential loss are finite. Specifically, our first result shows that the exponential loss of AdaBoost’s computed parameter vector will be at most $\varepsilon$ more than that of any parameter vector of $\ell_1$-norm bounded by $B$ after a number of rounds that is bounded by a polynomial in $B$ and $1/\varepsilon$. We also provide lower-bound examples showing that a polynomial dependence on these parameters is necessary. Our second result is that within $C/\varepsilon$ iterations, AdaBoost achieves a value of the exponential loss that is at most $\varepsilon$ more than the best possible value, where $C$ depends on the dataset. We show that this dependence of the rate on $\varepsilon$ is optimal up to constant factors, i.e., at least $\Omega(1/\varepsilon)$ rounds are necessary to come within $\varepsilon$ of the optimal exponential loss.
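
To make the quantity in the abstract concrete, the following is a minimal sketch of AdaBoost viewed as greedy coordinate descent on the exponential loss. It is an illustration under stated assumptions, not the authors' code: the matrix `M` (with entries `M[i][j] = y_i * h_j(x_i)` in {-1, +1}), the toy data, and the function names `exp_loss` and `adaboost_losses` are all hypothetical. Each round picks the weak hypothesis with the largest absolute edge `r` under the current example weights and takes the standard step `0.5 * ln((1 + r) / (1 - r))`.

```python
import math


def exp_loss(M, lam):
    """Average exponential loss (1/m) * sum_i exp(-(M lam)_i)."""
    m = len(M)
    return sum(
        math.exp(-sum(M[i][j] * lam[j] for j in range(len(lam))))
        for i in range(m)
    ) / m


def adaboost_losses(M, rounds):
    """Run AdaBoost as coordinate descent; return (lam, loss per round).

    Assumes every entry of M is -1 or +1 (hypothetical toy setting).
    """
    m, n = len(M), len(M[0])
    lam = [0.0] * n
    losses = [exp_loss(M, lam)]
    for _ in range(rounds):
        # Example weights: proportional to exp(-margin), normalised to sum to 1.
        w = [math.exp(-sum(M[i][j] * lam[j] for j in range(n))) for i in range(m)]
        z = sum(w)
        d = [wi / z for wi in w]
        # Edge of each weak hypothesis under the current weights.
        edges = [sum(d[i] * M[i][j] for i in range(m)) for j in range(n)]
        j = max(range(n), key=lambda k: abs(edges[k]))
        r = edges[j]
        if r == 0.0 or abs(r) >= 1.0:
            # No progress possible, or the exact step would be infinite.
            break
        # Exact line search along coordinate j for +/-1 valued hypotheses.
        lam[j] += 0.5 * math.log((1.0 + r) / (1.0 - r))
        losses.append(exp_loss(M, lam))
    return lam, losses


# Hypothetical toy data: 4 examples, 3 weak hypotheses.
M_toy = [
    [+1, -1, +1],
    [+1, +1, -1],
    [-1, +1, +1],
    [+1, +1, +1],
]
lam_toy, losses_toy = adaboost_losses(M_toy, rounds=20)
```

With this step size, each round multiplies the loss by $\sqrt{1 - r^2}$, so the recorded losses never increase. The abstract's results concern how quickly such a sequence approaches the infimum of the loss; this sketch only shows the iteration itself and makes no claim about the rates.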

Authors

Indraneel Mukherjee, Cynthia Rudin, Robert E. Schapire

Topics

Machine Learning > Core Methods > Classification

Machine Learning > Optimization & Theory > Optimization

Machine Learning > Learning Types > Supervised Learning

Machine Learning > Learning Types > Ensemble Learning

Machine Learning > Learning Types > Classification

Keywords

ensemble learning, exponential loss, weak learner, convergence rate, weak classifier

Related papers

Competitive Closeness Testing 2011

Bandits, Query Learning, and the Haystack Dimension 2011

Minimax Policies for Combinatorial Prediction Games 2011

Sample Complexity Bounds for Differentially Private Learning 2011

Multiclass Learnability and the ERM principle 2011