Dynamical mean-field theory for stochastic gradient descent in Gaussian mixture classification

Francesca Mignacco; Florent Krzakala; Pierfrancesco Urbani; Lenka Zdeborová

NeurIPS 2020

Abstract

We analyze in closed form the learning dynamics of stochastic gradient descent (SGD) for a single-layer neural network classifying a high-dimensional Gaussian mixture in which each cluster is assigned one of two labels. This problem provides a prototype of a non-convex loss landscape with interpolating regimes and a large generalization gap. We define a particular stochastic process for which SGD can be extended to a continuous-time limit that we call stochastic gradient flow. In the full-batch limit, we recover the standard gradient flow. We apply dynamical mean-field theory from statistical physics to track the dynamics of the algorithm in the high-dimensional limit via a self-consistent stochastic process. We explore the performance of the algorithm as a function of its control parameters, shedding light on how it navigates the loss landscape.
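To make the setting concrete, the following is a minimal numerical sketch of mini-batch SGD for a single-layer classifier on a Gaussian mixture. It is not the paper's analysis: it does not implement the continuous-time stochastic gradient flow or the dynamical mean-field equations. The specific choices here are assumptions introduced for illustration and are not stated in the abstract: two symmetric clusters at ±v/√d with noise variance `delta`, a logistic loss with an optional ridge term, the 1/√d scaling of the pre-activation, and a fresh mini-batch drawn at every step. All function and parameter names are hypothetical.

```python
import numpy as np

def make_mixture(n, d, delta, rng):
    """Sample n points from two Gaussian clusters centred at +/- v/sqrt(d).

    Assumption for illustration: one cluster per label, labels in {-1, +1}.
    """
    v = rng.standard_normal(d)
    y = rng.choice([-1.0, 1.0], size=n)
    noise = np.sqrt(delta) * rng.standard_normal((n, d))
    x = y[:, None] * v[None, :] / np.sqrt(d) + noise
    return x, y, v

def loss_grad(w, x, y, ridge):
    """Gradient of the mean logistic loss plus (ridge/2)*|w|^2."""
    d = x.shape[1]
    margins = y * (x @ w) / np.sqrt(d)
    # d/dm log(1 + exp(-m)) = -1 / (1 + exp(m)), written stably via tanh.
    dloss = -0.5 * (1.0 - np.tanh(margins / 2.0))
    return x.T @ (dloss * y) / (len(y) * np.sqrt(d)) + ridge * w

def sgd(x, y, w0, batch_fraction, lr, steps, ridge, rng):
    """Mini-batch SGD; batch_fraction=1.0 gives full-batch gradient descent."""
    n = x.shape[0]
    w = np.array(w0, dtype=float)  # copy so the caller's w0 is untouched
    batch_size = max(1, int(round(batch_fraction * n)))
    for _ in range(steps):
        idx = rng.choice(n, size=batch_size, replace=False)
        w -= lr * loss_grad(w, x[idx], y[idx], ridge)
    return w

def accuracy(w, x, y):
    """Fraction of samples whose label matches sign(w . x)."""
    return float(np.mean(np.sign(x @ w) == y))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, y, _ = make_mixture(n=400, d=50, delta=0.25, rng=rng)
    for b in (0.1, 1.0):
        w = sgd(x, y, np.zeros(50), batch_fraction=b, lr=2.0,
                steps=500, ridge=0.0, rng=rng)
        print(f"batch fraction {b}: train accuracy {accuracy(w, x, y):.3f}")
```

With `batch_fraction=1.0` every step uses the full training set, so the update reduces to plain gradient descent, the discrete-time counterpart of the gradient flow that the abstract recovers in the full-batch limit.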

Topics

Machine Learning > Core Methods > Classification
Machine Learning > Optimization & Theory > Learning Theory
Machine Learning > Optimization & Theory > Neural Network Optimization
Deep Learning > Learning Types > Deep Learning
Deep Learning > Optimization & Theory > Theory

Keywords

stochastic gradient descent; neural network training; loss landscape; high-dimensional analysis; learning dynamics; gaussian mixture; neural network; dynamical mean-field theory; gaussian mixture classification
