Open Problem: Tight Convergence of SGD in Constant Dimension

Tomer Koren; Shahar Segal

COLT 2020

Abstract

Stochastic Gradient Descent (SGD) is one of the most popular optimization methods in machine learning and has been studied extensively since the early 1950s. However, our understanding of this fundamental algorithm is still lacking in certain aspects. We point out a gap that remains between the known upper and lower bounds for the expected suboptimality of the last SGD point whenever the dimension is a constant independent of the number of SGD iterations $T$, and in particular, that the gap is still unaddressed even in the one-dimensional case. For the latter, we provide evidence that the correct rate is $\Theta(1/\sqrt{T})$ and conjecture that the same applies in any (constant) dimension.
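
To make the quantity in question concrete, the following is a minimal sketch of how one might estimate the expected suboptimality of the last SGD point in one dimension. Everything specific here is an assumption chosen for illustration and is not taken from the paper: the objective $f(x) = |x|$ on $[-1, 1]$, the $\pm 1$ gradient noise, the step size $\eta_t = 1/\sqrt{t}$, and the helper name `last_iterate_suboptimality`.

```python
import random


def last_iterate_suboptimality(T, runs=200, seed=0):
    """Monte Carlo estimate of E[f(x) - f(x*)] for the point reached after
    T steps of projected SGD on f(x) = |x| over [-1, 1] (illustrative setup)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        x = 1.0  # start at the boundary of the domain
        for t in range(1, T + 1):
            # Subgradient of |x| (taking 0 at x = 0) plus zero-mean +/-1 noise.
            sign = (x > 0) - (x < 0)
            g = sign + rng.choice((-1.0, 1.0))
            eta = 1.0 / t ** 0.5  # assumed step-size schedule
            # Gradient step followed by projection back onto [-1, 1].
            x = max(-1.0, min(1.0, x - eta * g))
        total += abs(x)  # f(x*) = 0 is attained at x* = 0
    return total / runs
```

Calling the function for increasing horizons, for example `T` in `(100, 1000, 10000)`, gives a rough empirical view of how the last-iterate error decays with $T$. Such a simulation only illustrates the quantity being bounded; it does not settle which rate is tight.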

Topics

Machine Learning > Optimization & Theory > Learning Theory

Machine Learning > Optimization & Theory > Optimization

Keywords

learning theory, stochastic gradient descent, convergence rate, suboptimality bound, constant dimension
