2012 AISTATS AISTATS 2012

Information Theoretic Model Validation for Spectral Clustering

Abstract

Model validation constitutes a fundamental step in data clustering. The central question is: Which cluster model and how many clusters are most appropriate for a certain application? In this study, we introduce a method for the validation of spectral clustering based upon approximation set coding. In particular, we compare correlation and pairwise clustering to analyze the correlations of temporal gene expression profiles. To evaluate and select clustering models, we calculate their reliable informativeness. Experimental results in the context of gene expression analysis show that pairwise clustering yields superior amounts of reliable information. The analysis results are consistent with the Bayesian Information Criterion (BIC), and exhibit higher generality than BIC.

🌉 Interdisciplinary Bridge — Machine Learning and Mathematics & Optimization
🧭 Keyword Pioneer — approximation set coding
🐝 Cross-Pollinator — Artificial Intelligence, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization
🐣 Hot Topic Early Bird — information theory