2009
NIPS
NeurIPS 2009
Unsupervised Feature Selection for the $k$-means Clustering Problem
Abstract
We present a novel feature selection algorithm for the $k$-means clustering problem. Our algorithm is randomized and, assuming an accuracy parameter $\epsilon \in (0,1)$, selects and appropriately rescales in an unsupervised manner $\Theta(k \log(k / \epsilon) / \epsilon^2)$ features from a dataset of arbitrary dimensions. We prove that, if we run any $\gamma$-approximate $k$-means algorithm ($\gamma \geq 1$) on the features selected using our method, we can find a $(1+(1+\epsilon)\gamma)$-approximate partition with high probability.
🧭
Keyword Pioneer
— unsupervised feature selection
🐝
Cross-Pollinator
— Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics
📈
Trend Setter
— Unsupervised Learning
🐣
Hot Topic Early Bird
— k-means clustering