Learning Stochastic Feedforward Neural Networks

Charlie Tang; Ruslan Salakhutdinov

2013 NIPS NeurIPS 2013

Learning Stochastic Feedforward Neural Networks

Abstract

Multilayer perceptrons (MLPs) or neural networks are popular models used for nonlinear regression and classification tasks. As regressors, MLPs model the conditional distribution of the predictor variables Y given the input variables X. However, this predictive distribution is assumed to be unimodal (e.g. Gaussian). For tasks such as structured prediction problems, the conditional distribution should be multimodal, forming one-to-many mappings. By using stochastic hidden variables rather than deterministic ones, Sigmoid Belief Nets (SBNs) can induce a rich multimodal distribution in the output space. However, previously proposed learning algorithms for SBNs are very slow and do not work well for real-valued data. In this paper, we propose a stochastic feedforward network with hidden layers having \emph{both deterministic and stochastic} variables. A new Generalized EM training procedure using importance sampling allows us to efficiently learn complicated conditional distributions. We demonstrate the superiority of our model to conditional Restricted Boltzmann Machines and Mixture Density Networks on synthetic datasets and on modeling facial expressions. Moreover, we show that latent features of our model improves classification and provide additional qualitative results on color images.

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning

🧭 Keyword Pioneer — sigmoid belief nets

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Speech & Audio

📈 Trend Setter — Self-Supervised Learning

🐣 Hot Topic Early Bird — structured prediction

Authors

Charlie Tang , Ruslan Salakhutdinov

Topics

Machine Learning > Core Methods > Classification Machine Learning > Core Methods > Regression Machine Learning > Core Methods > Representation Learning Machine Learning > Learning Types > Self-Supervised Learning Deep Learning > Architectures > Neural Networks Deep Learning > Models > Generative Models Deep Learning > Learning Types > Self-Supervised Learning Deep Learning > Models > Neural Networks Deep Learning > Learning Types > Deep Learning

Keywords

structured prediction importance sampling latent feature learning conditional distribution sigmoid belief nets generalized expectation maximization multimodal distribution multilayer perceptron sigmoid belief network stochastic neural network stochastic feedforward network

Download PDF

Related papers

Latent Structured Active Learning 2013

On Flat versus Hierarchical Classification in Large-Scale Taxonomies 2013

Generalized Method-of-Moments for Rank Aggregation 2013

Third-Order Edge Statistics: Contour Continuation, Curvature, and Cortical Connections 2013

Accelerated Mini-Batch Stochastic Dual Coordinate Ascent 2013