Classification via Minimum Incremental Coding Length (MICL)

John Wright; Yangyu Tao; Zhouchen Lin; Yi Ma; Heung-yeung Shum

NeurIPS (NIPS) 2007

Abstract

We present a simple new criterion for classification, based on principles from lossy data compression. The criterion assigns a test sample to the class that uses the minimum number of additional bits to code the test sample, subject to an allowable distortion. We prove asymptotic optimality of this criterion for Gaussian data and analyze its relationships to classical classifiers. Theoretical results provide new insights into relationships among popular classifiers such as MAP and RDA, as well as unsupervised clustering methods based on lossy compression [13]. Minimizing the lossy coding length induces a regularization effect which stabilizes the (implicit) density estimate in a small-sample setting. Compression also provides a uniform means of handling classes of varying dimension. This simple classification criterion and its kernel and local versions perform competitively against existing classifiers on both synthetic examples and real imagery data such as handwritten digits and human faces, without requiring domain-specific information.
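
As a rough illustration of the criterion described in the abstract, the Python sketch below assigns a test sample to the class whose coding length grows least when the sample is added. The abstract gives no formula, so the coding-length function is an assumption: the sketch uses a lossy coding length for Gaussian data at distortion eps^2, as found in the lossy-compression literature, plus a class-label cost of -log2 of the empirical class frequency. The names coding_length, micl_classify, and eps are hypothetical and chosen only for this example.

```python
import numpy as np

def coding_length(X, eps):
    """Approximate number of bits needed to code the columns of X
    (shape n x m: n features, m samples) up to distortion eps**2.
    Assumed Gaussian lossy coding length; not stated in the abstract."""
    n, m = X.shape
    mu = X.mean(axis=1, keepdims=True)
    Xc = X - mu
    sigma = Xc @ Xc.T / m  # empirical covariance
    # slogdet is numerically safer than log(det(...))
    _, logdet = np.linalg.slogdet(np.eye(n) + (n / (eps**2 * m)) * sigma)
    bits_cov = 0.5 * (m + n) * logdet / np.log(2.0)
    bits_mean = 0.5 * n * np.log2(1.0 + float(np.sum(mu**2)) / eps**2)
    return bits_cov + bits_mean

def micl_classify(x, classes, eps):
    """Return the label whose class needs the fewest additional bits
    to code x. `classes` maps label -> array of shape (n, m_label)."""
    m_total = sum(X.shape[1] for X in classes.values())
    best_label, best_bits = None, np.inf
    for label, X in classes.items():
        X_aug = np.hstack([X, x.reshape(-1, 1)])
        extra = coding_length(X_aug, eps) - coding_length(X, eps)
        # Assumed cost of coding the class label itself.
        extra += -np.log2(X.shape[1] / m_total)
        if extra < best_bits:
            best_label, best_bits = label, extra
    return best_label

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    classes = {
        "a": rng.normal(0.0, 1.0, size=(2, 50)),
        "b": rng.normal(0.0, 1.0, size=(2, 50)) + 10.0,
    }
    print(micl_classify(np.array([0.1, -0.2]), classes, eps=1.0))
    print(micl_classify(np.array([9.8, 10.3]), classes, eps=1.0))
```

In this sketch the distortion eps acts as the regularizer mentioned in the abstract: the identity term inside the determinant keeps the coding length finite even when a class has fewer samples than dimensions.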

Topics

Machine Learning > Core Methods > Classification

Machine Learning > Core Methods > Representation Learning

Mathematics & Optimization > Mathematics > Information Theory

Keywords

information theory, classification, density estimation, kernel classification, minimum incremental coding length, data compression, lossy compression, kernel methods, lossy data compression, minimum coding length, MAP classifier, RDA classifier
