Avoiding pathologies in very deep networks

David Duvenaud; Oren Rippel; Ryan Adams; Zoubin Ghahramani

AISTATS 2014

Abstract

Choosing appropriate architectures and regularization strategies for deep networks is crucial to good predictive performance. To shed light on this problem, we analyze the analogous problem of constructing useful priors on compositions of functions. Specifically, we study the deep Gaussian process, a type of infinitely-wide, deep neural network. We show that in standard architectures, the representational capacity of the network tends to capture fewer degrees of freedom as the number of layers increases, retaining only a single degree of freedom in the limit. We propose an alternate network architecture which does not suffer from this pathology. We also examine deep covariance functions, obtained by composing infinitely many feature transforms. Lastly, we characterize the class of models obtained by performing dropout on Gaussian processes.
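The loss of degrees of freedom with depth can be illustrated numerically. The following is a minimal sketch, not the authors' code: it approximates each Gaussian process layer with random Fourier features (an assumption made here for convenience), composes independently drawn layers, and tracks the ratio of the second-largest to the largest singular value of the Jacobian of the composition. All function names, the feature count, and the lengthscale are hypothetical choices for illustration. A ratio near zero suggests that, locally, the composed map varies along essentially one direction.

```python
import numpy as np


def sample_layer(dim, n_features, lengthscale, rng):
    """Draw one approximate GP layer R^dim -> R^dim.

    Assumption: an RBF-kernel GP is approximated by random Fourier features.
    """
    omega = rng.normal(scale=1.0 / lengthscale, size=(n_features, dim))
    phase = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    weights = rng.normal(size=(n_features, dim))
    return omega, phase, weights


def layer_value_and_jacobian(x, layer):
    """Evaluate a sampled layer and its exact Jacobian at the point x."""
    omega, phase, weights = layer
    scale = np.sqrt(2.0 / omega.shape[0])
    angle = omega @ x + phase
    value = scale * (np.cos(angle) @ weights)  # shape (dim,)
    # d value[d] / d x[j] = -scale * sum_m weights[m, d] * sin(angle[m]) * omega[m, j]
    jac = -scale * weights.T @ (np.sin(angle)[:, None] * omega)  # shape (dim, dim)
    return value, jac


def singular_value_ratio(depth, dim=2, n_features=200, lengthscale=1.0, rng=None):
    """Ratio s_2 / s_1 of the Jacobian of a depth-layer composition at a random point."""
    rng = np.random.default_rng() if rng is None else rng
    x = rng.normal(size=dim)
    total_jac = np.eye(dim)
    for _ in range(depth):
        layer = sample_layer(dim, n_features, lengthscale, rng)
        x, jac = layer_value_and_jacobian(x, layer)
        total_jac = jac @ total_jac  # chain rule: newest layer multiplies on the left
    s = np.linalg.svd(total_jac, compute_uv=False)  # sorted in descending order
    return s[1] / s[0]


def median_ratio(depth, n_draws=20, seed=0, **kwargs):
    """Median of the singular value ratio over independently drawn networks."""
    rng = np.random.default_rng(seed)
    ratios = [singular_value_ratio(depth, rng=rng, **kwargs) for _ in range(n_draws)]
    return float(np.median(ratios))


if __name__ == "__main__":
    for depth in (1, 5, 10, 30):
        print(depth, median_ratio(depth))
```

Under these assumptions the median ratio should shrink rapidly as `depth` grows, consistent with the abstract's claim that only a single degree of freedom is retained in the limit. The sketch covers only the standard layer-on-layer composition; it does not implement the alternate architecture the paper proposes.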

Topics

Artificial Intelligence > Bayesian & Probabilistic > Probabilistic Modeling
Machine Learning > Core Methods > Representation Learning
Deep Learning > Architectures > Neural Networks
Deep Learning > Learning Types > Deep Learning
Machine Learning > Bayesian & Probabilistic > Gaussian Processes

Keywords

neural network architecture, representation learning, network architecture, bayesian deep learning, deep gaussian process, function composition, neural network
