2007 NIPS NeurIPS 2007

Classification via Minimum Incremental Coding Length (MICL)

Abstract

We present a simple new criterion for classification, based on principles from lossy data compression. The criterion assigns a test sample to the class that uses the min- imum number of additional bits to code the test sample, subject to an allowable distortion. We prove asymptotic optimality of this criterion for Gaussian data and analyze its relationships to classical classifiers. Theoretical results provide new insights into relationships among popular classifiers such as MAP and RDA, as well as unsupervised clustering methods based on lossy compression [13]. Mini- mizing the lossy coding length induces a regularization effect which stabilizes the (implicit) density estimate in a small-sample setting. Compression also provides a uniform means of handling classes of varying dimension. This simple classi- fication criterion and its kernel and local versions perform competitively against existing classifiers on both synthetic examples and real imagery data such as hand- written digits and human faces, without requiring domain-specific information.

🧭 Keyword Pioneer — minimum incremental coding length
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning
🌉 Interdisciplinary Bridge — Machine Learning and Mathematics & Optimization
📈 Trend Setter — Information Theory
🐣 Hot Topic Early Bird — information theory