AISTATS 2018

The emergence of spectral universality in deep networks

Abstract

Recent work has shown that tight concentration of the entire spectrum of singular values of a deep network’s input-output Jacobian around one at initialization can speed up learning by orders of magnitude. To guide design choices, it is therefore important to build a full theoretical understanding of the spectra of Jacobians at initialization. To this end, we leverage powerful tools from free probability theory to provide a detailed analytic understanding of how a deep network’s Jacobian spectrum depends on various hyperparameters, including the nonlinearity, the weight and bias distributions, and the depth. For a variety of nonlinearities, our work reveals the emergence of new universal limiting spectral distributions that remain concentrated around one even as the depth goes to infinity.
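The quantity the abstract refers to can be inspected numerically. The following is a minimal sketch, not the paper's method: it builds a random fully connected network at initialization, accumulates the input-output Jacobian layer by layer, and returns its singular values. The function names, the zero biases, the square layers, and the default `sigma_w` are all assumptions made for illustration.

```python
import numpy as np

def random_weights(n, rng, orthogonal, sigma_w=1.0):
    """Draw an n x n weight matrix (hypothetical helper, not from the paper)."""
    g = rng.standard_normal((n, n))
    if not orthogonal:
        # Gaussian init with entry variance sigma_w^2 / n.
        return sigma_w * g / np.sqrt(n)
    # Orthogonal init: QR of a Gaussian matrix, with a column sign fix
    # so that q is uniformly (Haar) distributed.
    q, r = np.linalg.qr(g)
    return sigma_w * q * np.sign(np.diag(r))

def jacobian_singular_values(depth, width, phi, dphi, orthogonal,
                             sigma_w=1.0, seed=0):
    """Singular values of the input-output Jacobian at initialization.

    Assumes zero biases and equal-width layers; the Jacobian is the
    product over layers of D_l W_l, where D_l = diag(phi'(h_l)).
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(width)
    jac = np.eye(width)
    for _ in range(depth):
        w = random_weights(width, rng, orthogonal, sigma_w)
        h = w @ x                            # pre-activations of this layer
        jac = (dphi(h)[:, None] * w) @ jac   # left-multiply by D_l W_l
        x = phi(h)                           # post-activations
    return np.linalg.svd(jac, compute_uv=False)

def tanh_prime(h):
    return 1.0 - np.tanh(h) ** 2

# Compare how the spectrum spreads for two weight distributions.
sv_gauss = jacobian_singular_values(32, 200, np.tanh, tanh_prime, orthogonal=False)
sv_orth = jacobian_singular_values(32, 200, np.tanh, tanh_prime, orthogonal=True)
print("Gaussian   min/max:", sv_gauss.min(), sv_gauss.max())
print("Orthogonal min/max:", sv_orth.min(), sv_orth.max())
```

Varying `depth`, `sigma_w`, the nonlinearity, and the `orthogonal` flag gives a rough empirical view of how the spread of singular values depends on these choices; a histogram of the returned array is the finite-width counterpart of the limiting spectral distributions discussed above.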
