2019 COLT COLT 2019

Normal Approximation for Stochastic Gradient Descent via Non-Asymptotic Rates of Martingale CLT

Abstract

We provide non-asymptotic convergence rates of the Polyak-Ruppert averaged stochastic gradient descent (SGD) to a normal random vector for a class of twice-differentiable test functions. A crucial intermediate step is proving a non-asymptotic martingale central limit theorem (CLT), i.e., establishing the rates of convergence of a multivariate martingale difference sequence to a normal random vector, which might be of independent interest. We obtain the explicit rates for the multivariate martingale CLT using a combination of Stein?s method and Lindeberg?s argument, which is then used in conjunction with a non-asymptotic analysis of averaged SGD proposed in [PJ92]. Our results have potentially interesting consequences for computing confidence intervals for parameter estimation with SGD and constructing hypothesis tests with SGD that are valid in a non-asymptotic sense

🌉 Interdisciplinary Bridge — Machine Learning and Mathematics & Optimization
🐣 Hot Topic Early Bird — confidence interval
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio