2022
JMLR
JMLR 2022
Regularized K-means Through Hard-Thresholding
Abstract
We study a framework for performing regularized K-means, based on direct penalization of the size of the cluster centers. Different penalization strategies are considered and compared in a theoretical analysis and an extensive Monte Carlo simulation study. Based on the results, we propose a new method called hard-threshold K-means (HTK-means), which uses an ℓ0 penalty to induce sparsity. HTK-means is a fast and competitive sparse clustering method which is easily interpretable, as is illustrated on several real data examples. In this context, new graphical displays are presented and used to gain further insight into the data sets. [abs] [ pdf ][ bib ] [ code ] © JMLR 2022. (edit, beta)
🌉
Interdisciplinary Bridge
— Data Science & Analytics and Machine Learning
🧭
Keyword Pioneer
— sparse clustering
🐝
Cross-Pollinator
— Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Security & Privacy, Speech & Audio