Faster Boosting with Smaller Memory

Julaiti Alafate; Yoav S Freund

NeurIPS 2019

Abstract

State-of-the-art implementations of boosting, such as XGBoost and LightGBM, can process large training sets extremely fast. However, this performance requires enough memory to hold two to three times the size of the training set. This paper presents an alternative approach to implementing boosted trees that achieves a significant speedup over XGBoost and LightGBM, especially when memory is small. The speedup comes from combining three techniques: early stopping, effective sample size, and stratified sampling. Our experiments demonstrate a 10x to 100x speedup over XGBoost when the training data is too large to fit in memory.
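
The abstract names effective sample size without defining it. A common definition for a weighted sample is Kish's formula, n_eff = (sum of w)^2 / (sum of w^2), which drops well below the actual sample count when a few examples carry most of the weight. The sketch below assumes that definition; the function name and the example weights are illustrative and are not taken from the paper.

```python
import numpy as np


def effective_sample_size(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2).

    Assumed definition for illustration; the paper's exact
    formulation may differ.
    """
    w = np.asarray(weights, dtype=float)
    total = w.sum()
    if total == 0.0:
        return 0.0  # no weight mass, so no effective examples
    return float(total * total / np.square(w).sum())


# Uniform weights: every example counts equally.
uniform = effective_sample_size(np.ones(1000))

# One example dominates: the sample behaves like only a few examples.
skewed = effective_sample_size([1000.0] + [1.0] * 999)

print(uniform, skewed)
```

With uniform weights the effective size equals the number of examples. When one weight dominates, it collapses to a handful, which is the kind of signal a boosting implementation could use to decide that the in-memory sample should be refreshed.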

Topics

Machine Learning > Core Methods > Classification

Machine Learning > Optimization & Theory > Optimization

Machine Learning > Core Methods > Ensemble Learning

Machine Learning > Core Methods > Optimization

Machine Learning > Optimization & Theory > Efficient Computing

Keywords

ensemble learning, computational efficiency, stratified sampling, gradient boosting, algorithm optimization, early stopping, memory efficiency

Related papers

Two Generator Game: Learning to Sample via Linear Goodness-of-Fit Test 2019

Metalearned Neural Memory 2019

Model Similarity Mitigates Test Set Overuse 2019

Continual Unsupervised Representation Learning 2019

Reinforcement Learning with Convex Constraints 2019