Information Bottleneck for Non Co-Occurrence Data

Yevgeny Seldin; Noam Slonim; Naftali Tishby

NIPS (NeurIPS) 2006

Abstract

We present a general model-independent approach to the analysis of data that do not appear as co-occurrences of two variables X, Y, but rather as a sample of values of an unknown (stochastic) function Z(X, Y). For example, in gene expression data, the expression level Z is a function of gene X and condition Y; in movie ratings data, the rating Z is a function of viewer X and movie Y. The approach is a consistent extension of the Information Bottleneck method, which has previously relied on the availability of co-occurrence statistics. By altering the relevance variable we eliminate the need for a sample of the joint distribution of all input variables. This new formulation also enables simple MDL-like model complexity control and prediction of missing values of Z. The approach is analyzed and shown to be on a par with the best known clustering algorithms for a wide range of domains. For the prediction of missing values (collaborative filtering) it improves on the currently best known results.
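To make the data setting concrete, the following is a minimal illustrative sketch and not the authors' algorithm. It assumes the observations arrive as sparse (x, y, z) triples and that cluster assignments for X and Y are already given; the names `block_means`, `predict`, `row_cluster`, and `col_cluster` are hypothetical. A missing value of Z is then guessed as the mean of the observed values that fall in the same pair of clusters, which is one simple way a clustering of X and Y can support missing value prediction.

```python
from collections import defaultdict


def block_means(samples, row_cluster, col_cluster):
    """Average the observed z values within each (X-cluster, Y-cluster) block."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y, z in samples:
        key = (row_cluster[x], col_cluster[y])
        sums[key] += z
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def predict(x, y, means, row_cluster, col_cluster, default):
    """Predict Z(x, y) as its block mean; use `default` if the block is empty."""
    return means.get((row_cluster[x], col_cluster[y]), default)


# Toy movie-ratings sample: (viewer, movie, rating). Most pairs are unobserved.
samples = [
    ("ann", "m1", 5.0),
    ("ann", "m2", 4.0),
    ("bob", "m1", 4.0),
    ("cat", "m3", 1.0),
    ("cat", "m1", 2.0),
]

# Hypothetical, hand-picked cluster assignments (assumed given, not learned here).
row_cluster = {"ann": 0, "bob": 0, "cat": 1}
col_cluster = {"m1": 0, "m2": 0, "m3": 1}

means = block_means(samples, row_cluster, col_cluster)
global_mean = sum(z for _, _, z in samples) / len(samples)

# bob has not rated m2; both fall in block (0, 0), whose observed mean is 13/3.
prediction = predict("bob", "m2", means, row_cluster, col_cluster, global_mean)
# ann and m3 fall in block (0, 1), which has no observations: use the global mean.
fallback = predict("ann", "m3", means, row_cluster, col_cluster, global_mean)
```

In this sketch the clusters are fixed by hand; in the paper's setting the clustering itself is what the method learns, with model complexity controlled in an MDL-like way.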

Topics

Machine Learning > Core Methods > Clustering

Machine Learning > Core Methods > Representation Learning

Data Science & Analytics > Applications > Recommender Systems

Mathematics & Optimization > Mathematics > Information Theory

Machine Learning > Bayesian & Probabilistic > Probabilistic Modeling

Machine Learning > Learning Types > Representation Learning

Machine Learning > Core Methods > Feature Learning

Machine Learning > Learning Types > Clustering

Keywords

representation learning, feature extraction, information bottleneck, collaborative filtering, missing value prediction, minimum description length, clustering algorithm, MDL principle
