2020 JMLR JMLR 2020

Recovery of a Mixture of Gaussians by Sum-of-Norms Clustering

Abstract

Sum-of-norms clustering is a method for assigning $n$ points in $\mathbf{R}^d$ to $K$ clusters, $1\le K\le n$, using convex optimization. Recently, Panahi (2017) proved that sum-of-norms clustering is guaranteed to recover a mixture of Gaussians under the restriction that the number of samples is not too large. The purpose of this note is to lift this restriction, that is, show that sum-of-norms clustering can recover a mixture of Gaussians even as the number of samples tends to infinity. Our proof relies on an interesting characterization of clusters computed by sum-of-norms clustering that was developed inside a proof of the agglomeration conjecture by Chiquet et al. (2017). Because we believe this theorem has independent interest, we restate and reprove the Chiquet et al. (2017) result herein. [abs] [ pdf ][ bib ] © JMLR 2020. (edit, beta)

🌉 Interdisciplinary Bridge — Machine Learning and Mathematics & Optimization
🧭 Keyword Pioneer — sum-of-norms clustering
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy