IJCAI 2022

FLS: A New Local Search Algorithm for K-means with Smaller Search Space

Abstract

The k-means problem is an extensively studied unsupervised learning problem with various applications in decision making and data mining. In this paper, we propose a fast and practical local search algorithm for the k-means problem. Our method reduces the search space of swap pairs from O(nk) to O(k^2), and applies random mutations to find potentially better solutions when the local search falls into a poor local optimum. Under the assumption that each optimal cluster has "average" size \Omega(n/k), which is common in many datasets and k-means benchmarks, we prove that our proposed algorithm gives a (100+\epsilon)-approximate solution in expectation. Experiments show that our algorithm outperforms existing state-of-the-art local search methods on k-means benchmarks and large datasets.
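
To make the O(nk) search space of swap pairs concrete, the following is a minimal sketch of the generic single-swap local search baseline for k-means, not the FLS algorithm itself. The abstract does not specify how FLS restricts candidates to O(k^2) pairs or how its random mutations work, so neither is shown. The function names (`sq_dist`, `kmeans_cost`, `best_swap`, `local_search`) and the restriction of centers to input points are assumptions made for illustration.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_cost(points, centers):
    """k-means objective: sum of squared distances to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def best_swap(points, centers):
    """Exhaustive single-swap step over all n * k (point, center) pairs.

    Returns the best strictly improving center set and its cost, or the
    input centers and their cost if no swap improves the objective.
    """
    best_centers, best_cost = list(centers), kmeans_cost(points, centers)
    for p in points:                      # n candidate points to swap in
        for i in range(len(centers)):     # k current centers to swap out
            trial = list(centers)
            trial[i] = p
            cost = kmeans_cost(points, trial)
            if cost < best_cost:
                best_centers, best_cost = trial, cost
    return best_centers, best_cost


def local_search(points, k, max_iters=50, seed=0):
    """Baseline local search (illustrative only, not FLS).

    Starts from k random input points and repeats the best improving
    swap until a local optimum or the iteration cap is reached.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    cost = kmeans_cost(points, centers)
    for _ in range(max_iters):
        new_centers, new_cost = best_swap(points, centers)
        if new_cost >= cost:              # no improving swap: local optimum
            break
        centers, cost = new_centers, new_cost
    return centers, cost


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(local_search(pts, k=2))
```

Each call to `best_swap` evaluates n * k swap pairs, which is the search space the abstract refers to. A method that considers only O(k^2) pairs per step would replace the outer loop over all n points with a much smaller candidate set.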

🧭 Keyword Pioneer β€” local search algorithm
🐝 Cross-Pollinator β€” Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning
πŸŒ‰ Interdisciplinary Bridge β€” Machine Learning and Mathematics & Optimization