2016 AISTATS AISTATS 2016

Bipartite Correlation Clustering: Maximizing Agreements

Abstract

In Bipartite Correlation Clustering (BCC) we are given a complete bipartite graph G with ’+’ and ’-’ edges, and we seek a vertex clustering that maximizes the number of agreements: the number of all ’+’ edges within clusters plus all ’-’ edges cut across clusters. BCC is known to be NP-hard [5]. We present a novel approximation algorithm for k-BCC, a variant of BCC with an upper bound k on the number of clusters. Our algorithm outputs a k-clustering that provably achieves a number of agreements within a multiplicative (1-δ)-factor from the optimal, for any desired accuracy δ. It relies on solving a combinatorially constrained bilinear maximization on the bi-adjacency matrix of G. It runs in time exponential in k and 1/δ, but linear in the size of the input. Further, we show that, in the (unconstrained) BCC setting, an (1-δ)-approximation can be achieved by O(1/δ) clusters regardless of the size of the graph. In turn, our k-BCC algorithm implies an Efficient PTAS for the BCC objective of maximizing agreements.

🌉 Interdisciplinary Bridge — Machine Learning and Mathematics & Optimization
🧭 Keyword Pioneer — bipartite clustering
🐣 Hot Topic Early Bird — combinatorial optimization
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio