The LORACs Prior for VAEs: Letting the Trees Speak for the Data

Sharad Vikram; Matthew D. Hoffman; Matthew J. Johnson

AISTATS 2019

Abstract

In variational autoencoders, the prior on the latent codes $z$ is often treated as an afterthought, but the prior shapes the kind of latent representation that the model learns. If the goal is to learn a representation that is interpretable and useful, then the prior should reflect the ways in which the high-level factors that describe the data vary. The “default” prior is a standard normal, but if the natural factors of variation in the dataset exhibit discrete structure or are not independent, then the isotropic-normal prior will actually encourage learning representations that mask this structure. To alleviate this problem, we propose using a flexible Bayesian nonparametric hierarchical clustering prior based on the time-marginalized coalescent (TMC). To scale learning to large datasets, we develop a new inducing-point approximation and inference algorithm. We then apply the method without supervision to several datasets and examine the interpretability and practical performance of the inferred hierarchies and learned latent space.

Topics

Machine Learning > Core Methods > Representation Learning
Deep Learning > Models > Generative Models
Deep Learning > Models > Variational Inference
Machine Learning > Bayesian & Probabilistic > Variational Inference

Keywords

bayesian nonparametrics; variational inference; hierarchical clustering; latent representation; variational autoencoder; bayesian nonparametric prior; interpretable representation
