2006
NIPS
NeurIPS 2006
PG-means: learning the number of clusters in data
Abstract
We present PG-means, a novel algorithm that learns the number of clusters in a classical Gaussian mixture model. The method is robust and efficient: it applies statistical hypothesis tests to one-dimensional projections of both the data and the model to determine whether the examples are well represented by the model. This tests the entire model at once, rather than one cluster at a time. We show that the method works well in difficult cases such as non-Gaussian data, overlapping clusters, eccentric clusters, high dimension, and many true clusters. It also provides a much more stable estimate of the number of clusters than existing methods.
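The projection test described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the mixture parameters are taken as given (the model-fitting step is omitted), and an asymptotic Kolmogorov-Smirnov critical value stands in for whatever critical values the paper actually uses. The defaults `n_projections=12` and `critical_coeff=1.95` (roughly the 0.1% level) are illustrative choices.

```python
import math
import random


def normal_cdf(x, mean, var):
    """CDF of a one-dimensional Gaussian with the given mean and variance."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))


def project_model(weights, means, covs, p):
    """Project each Gaussian component onto direction p.

    A component N(mu, Sigma) projects to N(p.mu, p^T Sigma p).
    Returns a list of (weight, projected_mean, projected_variance).
    """
    d = len(p)
    projected = []
    for w, mu, cov in zip(weights, means, covs):
        m = sum(p[i] * mu[i] for i in range(d))
        v = sum(p[i] * cov[i][j] * p[j] for i in range(d) for j in range(d))
        projected.append((w, m, v))
    return projected


def mixture_cdf(t, comps):
    """CDF of the projected one-dimensional Gaussian mixture."""
    return sum(w * normal_cdf(t, m, v) for w, m, v in comps)


def ks_statistic(values, cdf):
    """Largest gap between the empirical CDF of values and a model CDF."""
    xs = sorted(values)
    n = len(xs)
    stat = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        stat = max(stat, (i + 1) / n - f, f - i / n)
    return stat


def random_unit_vector(d, rng):
    """Draw a direction uniformly at random from the unit sphere."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def model_accepted(data, weights, means, covs,
                   n_projections=12, critical_coeff=1.95, seed=0):
    """Return True if no random projection rejects the mixture model.

    Assumption: the threshold is the asymptotic KS critical value
    critical_coeff / sqrt(n), a simplification chosen for this sketch.
    """
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    critical = critical_coeff / math.sqrt(n)
    for _ in range(n_projections):
        p = random_unit_vector(d, rng)
        comps = project_model(weights, means, covs, p)
        projected_data = [sum(p[i] * x[i] for i in range(d)) for x in data]
        stat = ks_statistic(projected_data, lambda t: mixture_cdf(t, comps))
        if stat > critical:
            return False  # the whole model is rejected at once
    return True
```

In a full procedure one would presumably start from a single component, fit the mixture, call something like `model_accepted`, and add a component whenever the test rejects. That outer loop is left out here because its details are not given in the abstract.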