Convergence diagnostics for stochastic gradient descent with constant learning rate

Jerry Chee; Panos Toulis

AISTATS 2018

Abstract

Many iterative procedures in stochastic optimization exhibit a transient phase followed by a stationary phase. During the transient phase the procedure converges towards a region of interest, and during the stationary phase the procedure oscillates in that region, commonly around a single point. In this paper, we develop a statistical diagnostic test to detect this phase transition in the context of stochastic gradient descent with constant learning rate. We present theory and experiments suggesting that the region where the proposed diagnostic is activated coincides with the convergence region. For a class of loss functions, we derive a closed-form solution describing this region. Finally, we suggest an application to speed up convergence of stochastic gradient descent by halving the learning rate each time stationarity is detected. This leads to a new variant of stochastic gradient descent, which in many settings is comparable to the state of the art.
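The halving scheme described in the abstract can be sketched roughly as follows. The abstract does not spell out the diagnostic itself, so this sketch assumes one common choice: a running sum of inner products of successive stochastic gradients, with stationarity declared when that sum turns negative after a burn-in period. The function name `sgd_half`, the least-squares loss, and the parameters `lr`, `burnin`, and `n_iters` are all illustrative assumptions, not taken from the paper.

```python
import numpy as np

def sgd_half(X, y, lr=0.1, burnin=100, n_iters=20000, seed=0):
    """SGD on least squares that halves the learning rate whenever an
    assumed stationarity diagnostic fires (hypothetical sketch)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    prev_grad = None   # gradient from the previous step
    stat = 0.0         # running sum of successive-gradient inner products
    steps = 0          # steps taken since the last halving
    halvings = 0
    for _ in range(n_iters):
        i = rng.integers(n)
        # Gradient of 0.5 * (x_i . theta - y_i)^2 at the current iterate.
        grad = (X[i] @ theta - y[i]) * X[i]
        theta = theta - lr * grad
        if prev_grad is not None:
            stat += grad @ prev_grad
        prev_grad = grad
        steps += 1
        # Assumed rule: after burn-in, a negative sum signals stationarity.
        if steps > burnin and stat < 0:
            lr /= 2.0
            halvings += 1
            stat, steps, prev_grad = 0.0, 0, None
    return theta, lr, halvings
```

The intuition behind the assumed statistic is that successive gradients tend to point in similar directions while the iterate is still travelling towards the optimum, and in opposing directions once it oscillates around it. Resetting the statistic after each halving lets the test run afresh at the new learning rate.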

Authors

Jerry Chee, Panos Toulis

Topics

Machine Learning > Optimization & Theory > Neural Network Optimization

Machine Learning > Optimization & Theory > Optimization

Mathematics & Optimization > Optimization > Stochastic Methods

Keywords

stochastic gradient descent, convergence diagnostic, constant learning rate, transient phase, stationary phase
