Learning Ensembles from Bites: A Scalable and Accurate Approach

Nitesh V. Chawla; Lawrence O. Hall; Kevin W. Bowyer; W. Philip Kegelmeyer

JMLR 2004

Abstract

Bagging and boosting are two popular ensemble methods that typically achieve better accuracy than a single classifier. These techniques have limitations on massive data sets, because the size of the data set can be a bottleneck. Voting many classifiers built on small subsets of data ("pasting small votes") is a promising approach for learning from massive data sets, one that can utilize the power of boosting and bagging. We propose a framework for building hundreds or thousands of such classifiers on small subsets of data in a distributed environment. Experiments show this approach is fast, accurate, and scalable.
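The core idea in the abstract, voting many classifiers that were each trained on a small subset ("bite") of the data, can be illustrated with a minimal single-machine sketch. This is not the paper's distributed framework: the decision-stump base learner, the uniform random sampling of bites, the toy data, and every function name below (`train_stump`, `train_on_bites`, `vote`, `make_toy_data`) are assumptions chosen only to keep the example short and dependency-free.

```python
import random
from collections import Counter


def train_stump(sample):
    """Fit a one-feature threshold classifier (decision stump) on a small sample.

    The stump is an assumed stand-in for whatever base learner is actually used.
    """
    best = None
    n_features = len(sample[0][0])
    for f in range(n_features):
        for x, _ in sample:
            t = x[f]  # candidate threshold taken from an observed value
            left = [y for xi, y in sample if xi[f] <= t]
            right = [y for xi, y in sample if xi[f] > t]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(y != left_label for y in left)
            errors += sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    _, f, t, left_label, right_label = best
    return lambda x: left_label if x[f] <= t else right_label


def train_on_bites(data, n_classifiers, bite_size, seed=0):
    """Train one classifier per bite; each bite is a small uniform random subset."""
    rng = random.Random(seed)
    return [train_stump(rng.sample(data, bite_size)) for _ in range(n_classifiers)]


def vote(classifiers, x):
    """Combine the ensemble by unweighted majority vote."""
    return Counter(clf(x) for clf in classifiers).most_common(1)[0][0]


def make_toy_data(n, seed=1):
    """Hypothetical toy problem: label is 1 when the first feature exceeds 0.5."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    return [(p, int(p[0] > 0.5)) for p in points]


data = make_toy_data(2000)
ensemble = train_on_bites(data, n_classifiers=25, bite_size=40)
accuracy = sum(vote(ensemble, x) == y for x, y in data) / len(data)
print(f"ensemble of {len(ensemble)} stumps, accuracy on toy data: {accuracy:.3f}")
```

Each classifier sees only 40 of the 2000 examples, so training cost per classifier is independent of the full data set size, and the bites could in principle be trained on separate machines because they do not depend on one another in this simplified variant.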

Topics

Machine Learning > Core Methods > Classification

Machine Learning > Application Areas > Efficient Computing

Machine Learning > Learning Types > Ensemble Learning

Keywords

ensemble learning, scalable learning, distributed learning, classifier ensemble, massive dataset

Related papers

Selective Rademacher Penalization and Reduced Error Pruning of Decision Trees 2004

Fast String Kernels using Inexact Matching for Protein Sequences 2004

Learning the Kernel Matrix with Semidefinite Programming 2004

Weather Data Mining Using Independent Component Analysis 2004

A Geometric Approach to Multi-Criterion Reinforcement Learning 2004