Modeling Dyadic Data with Binary Latent Factors

Edward Meeds; Zoubin Ghahramani; Radford M. Neal; Sam T. Roweis

NIPS (NeurIPS) 2006

Abstract

We introduce binary matrix factorization, a novel model for unsupervised matrix decomposition. The decomposition is learned by fitting a non-parametric Bayesian probabilistic model with binary latent variables to a matrix of dyadic data. Unlike bi-clustering models, which assign each row or column to a single cluster based on a categorical hidden feature, our binary feature model reflects the prior belief that items and attributes can be associated with more than one latent cluster at a time. We provide simple learning and inference rules for this new model and show how to extend it to an infinite model in which the number of features is not a priori fixed but is allowed to grow with the size of the data.

1 Distributed representations for dyadic data

One of the major goals of probabilistic unsupervised learning is to discover underlying or hidden structure in a dataset by using latent variables to describe a complex data generation process. In this paper we focus on dyadic data: our domains have two finite sets of objects/entities, and observations are made on dyads (pairs with one element from each set). Examples include sparse matrices of movie-viewer ratings, word-document counts or product-customer purchases. A simple way to capture structure in this kind of data is to do "bi-clustering" (possibly using mixture models) by grouping the rows and (independently or simultaneously) the columns [6, 13, 9]. The modelling assumption in such a case is that movies come in types and viewers in types, and that knowing the type of movie and type of viewer is sufficient to predict the response. Clustering [...]

[...] componential structure: each item (row) has associated with it an unobserved vector of binary features; similarly, each attribute (column) has a hidden vector of binary features. Knowing [...] the matrix $X$ into (a distribution defined by) the product $UWV^\top$, where $U$ and $V$ are binary feature matrices and $W$ is a real-valued weight matrix. Below, we develop this binary matrix factorization [...]
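To make the shape of the factorization concrete, here is a minimal sketch in plain Python of the product of a binary item-feature matrix U, a real-valued weight matrix W, and the transpose of a binary attribute-feature matrix V. The matrix sizes, the independent Bernoulli draws for U and V, the Gaussian draws for W, and the helper names (`sample_binary`, `matmul`, `transpose`) are all assumptions made for illustration. In particular, the Bernoulli draws are a stand-in and not the non-parametric prior the paper develops, and no noise model or inference step is shown.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def sample_binary(rows, cols, p, rng):
    """Hypothetical helper: independent Bernoulli(p) entries (placeholder prior)."""
    return [[1 if rng.random() < p else 0 for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)

# Assumed sizes: I items (rows), J attributes (columns),
# K binary features per item, L binary features per attribute.
I, J, K, L = 4, 5, 3, 2

U = sample_binary(I, K, 0.5, rng)   # item features; a row may have several 1s
V = sample_binary(J, L, 0.5, rng)   # attribute features; a row may have several 1s
W = [[rng.gauss(0.0, 1.0) for _ in range(L)] for _ in range(K)]  # real-valued weights

# Noise-free response: entry (i, j) sums W[k][l] over every pair of
# active features (k for item i, l for attribute j).
mean_X = matmul(matmul(U, W), transpose(V))

for row in mean_X:
    print(["%6.2f" % x for x in row])
```

If every row of U and V were restricted to exactly one active feature, each entry of the product would reduce to a single weight W[k][l], which corresponds to the bi-clustering assumption described above. Allowing several active features per row is what lets an item or attribute belong to more than one latent cluster at a time.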

Keywords

unsupervised learning, matrix decomposition, unsupervised matrix decomposition, non-parametric Bayesian, probabilistic modeling, variational inference, dyadic data, matrix factorization, binary matrix factorization, latent variable model, latent factor model, binary latent factors
