Analysis of a Random Forests Model

Gérard Biau

JMLR 2012

Abstract

Random forests are a scheme proposed by Leo Breiman in the 2000s for building a predictor ensemble from a set of decision trees that grow in randomly selected subspaces of the data. Despite growing interest and practical use, there has been little exploration of the statistical properties of random forests, and little is known about the mathematical forces driving the algorithm. In this paper, we offer an in-depth analysis of a random forests model suggested by Breiman (2004), which is very close to the original algorithm. We show in particular that the procedure is consistent and adapts to sparsity, in the sense that its rate of convergence depends only on the number of strong features and not on how many noise variables are present.

© JMLR 2012.
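The idea of trees grown on randomly selected coordinates can be illustrated with a minimal sketch. It is not the paper's exact model: choosing the split coordinate uniformly at random, cutting each cell at its midpoint on the unit cube, falling back to the parent's mean in empty cells, and all function names (`grow_tree`, `predict_tree`, `grow_forest`, `predict_forest`) are assumptions made only for illustration.

```python
import random


def grow_tree(points, labels, lower, upper, depth, rng, fallback=0.0):
    """Grow one purely random tree on the cell [lower, upper].

    Assumption: the split coordinate is drawn uniformly at random and the
    cut is placed at the midpoint of the cell along that coordinate.
    """
    # Cell estimate: mean label in the cell, or the parent's mean if empty.
    value = sum(labels) / len(labels) if labels else fallback
    if depth == 0 or not labels:
        return ("leaf", value)
    j = rng.randrange(len(lower))          # randomly selected coordinate
    mid = (lower[j] + upper[j]) / 2.0      # midpoint cut
    left = [(p, y) for p, y in zip(points, labels) if p[j] < mid]
    right = [(p, y) for p, y in zip(points, labels) if p[j] >= mid]
    upper_left = upper[:j] + [mid] + upper[j + 1:]
    lower_right = lower[:j] + [mid] + lower[j + 1:]
    left_node = grow_tree([p for p, _ in left], [y for _, y in left],
                          lower, upper_left, depth - 1, rng, value)
    right_node = grow_tree([p for p, _ in right], [y for _, y in right],
                           lower_right, upper, depth - 1, rng, value)
    return ("split", j, mid, left_node, right_node)


def predict_tree(node, x):
    """Follow the cuts down to the leaf whose cell contains x."""
    while node[0] == "split":
        _, j, mid, left_node, right_node = node
        node = left_node if x[j] < mid else right_node
    return node[1]


def grow_forest(points, labels, n_trees=50, depth=6, seed=0):
    """Grow n_trees independent random trees on the unit cube [0, 1]^d."""
    rng = random.Random(seed)
    d = len(points[0])
    return [grow_tree(points, labels, [0.0] * d, [1.0] * d, depth, rng)
            for _ in range(n_trees)]


def predict_forest(forest, x):
    """The forest estimate is the average of the individual tree estimates."""
    return sum(predict_tree(tree, x) for tree in forest) / len(forest)


if __name__ == "__main__":
    # Toy sparse setting: only coordinate 0 carries signal, the rest are noise.
    rng = random.Random(1)
    X = [[rng.random() for _ in range(5)] for _ in range(500)]
    y = [10.0 * x[0] for x in X]
    forest = grow_forest(X, y)
    print(predict_forest(forest, [0.9, 0.5, 0.5, 0.5, 0.5]))
```

With uniform coordinate selection, cuts are spent on noise variables as often as on the informative one. The sparsity result in the abstract concerns the case where the selection probabilities favor the strong features, which this sketch does not attempt to reproduce.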

Topics

Machine Learning > Core Methods > Classification

Machine Learning > Optimization & Theory > Statistical Learning

Keywords

statistical learning, feature selection, random forest, decision tree ensemble

Related papers

Plug-in Approach to Active Learning 2012

An Active Learning Algorithm for Ranking from Pairwise Preferences with an Almost Optimal Query Complexity 2012

Eliminating Spammers and Ranking Annotators for Crowdsourced Labeling Tasks 2012

GPLP: A Local and Parallel Computation Toolbox for Gaussian Process Regression 2012

Query Strategies for Evading Convex-Inducing Classifiers 2012