NeurIPS 2024
Efficient Centroid-Linkage Clustering
Abstract
We give an algorithm for Centroid-Linkage Hierarchical Agglomerative Clustering (HAC), which computes a $c$-approximate clustering in roughly $n^{1+O(1/c^2)}$ time. We obtain our result by combining a new centroid-linkage HAC algorithm with a novel fully dynamic data structure for nearest-neighbor search that works under adaptive updates. We also evaluate our algorithm empirically. By leveraging a state-of-the-art nearest-neighbor search library, we obtain a fast and accurate centroid-linkage HAC algorithm. Compared to an existing state-of-the-art exact baseline, our implementation maintains clustering quality while delivering up to a $36\times$ speedup, which comes from performing fewer distance comparisons.
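For orientation, here is a minimal sketch of plain exact centroid-linkage HAC, the kind of procedure the abstract's algorithm approximates. It is not the paper's algorithm: it uses a brute-force search over all cluster pairs instead of a dynamic nearest-neighbor structure, and the function name `centroid_hac` and its return format are assumptions made for illustration.

```python
import math
from itertools import combinations


def centroid_hac(points):
    """Naive exact centroid-linkage HAC (illustrative, roughly cubic time).

    Returns the merges in order as (id_a, id_b, centroid_distance).
    Input points get ids 0..n-1; each merged cluster gets the next free id.
    """
    # Active clusters: id -> (centroid, size).
    clusters = {i: (tuple(map(float, p)), 1) for i, p in enumerate(points)}
    next_id = len(points)
    merges = []
    while len(clusters) > 1:
        # Brute-force search for the pair of clusters with the closest centroids.
        # An approximate method would replace this step with a nearest-neighbor
        # query and accept any pair within a factor c of the minimum distance.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: math.dist(clusters[pair[0]][0], clusters[pair[1]][0]),
        )
        (ca, na), (cb, nb) = clusters.pop(a), clusters.pop(b)
        merges.append((a, b, math.dist(ca, cb)))
        # The merged centroid is the size-weighted mean of the two centroids.
        merged = tuple((na * x + nb * y) / (na + nb) for x, y in zip(ca, cb))
        clusters[next_id] = (merged, na + nb)
        next_id += 1
    return merges
```

Calling `centroid_hac([(0, 0), (0, 1), (10, 0), (10, 1)])` merges the two left points, then the two right points, then the two resulting clusters. The repeated all-pairs scan is the cost that a nearest-neighbor structure is meant to avoid.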
Authors
Topics
Machine Learning > Core Methods > Clustering
Machine Learning > Optimization & Theory > Optimization
Data Science & Analytics > Applications > Clustering
Computer Science > Foundations > Algorithms
Machine Learning > Core Methods > Dimensionality Reduction
Mathematics & Optimization > Optimization > Approximation Algorithms