Improved SVRG for Non-Strongly-Convex or Sum-of-Non-Convex Objectives

Zeyuan Allen-Zhu; Yang Yuan

2016 ICML ICML 2016

Improved SVRG for Non-Strongly-Convex or Sum-of-Non-Convex Objectives

Abstract

Many classical algorithms are found until several years later to outlive the confines in which they were conceived, and continue to be relevant in unforeseen settings. In this paper, we show that SVRG is one such method: being originally designed for strongly convex objectives, it is also very robust in non-strongly convex or sum-of-non-convex settings. More precisely, we provide new analysis to improve the state-of-the-art running times in both settings by either applying SVRG or its novel variant. Since non-strongly convex objectives include important examples such as Lasso or logistic regression, and sum-of-non-convex objectives include famous examples such as stochastic PCA and is even believed to be related to training deep neural nets, our results also imply better performances in these applications.

🌉 Interdisciplinary Bridge — Machine Learning and Mathematics & Optimization

🧭 Keyword Pioneer — stochastic variance reduced gradient

🐣 Hot Topic Early Bird — non-convex optimization

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Security & Privacy

Authors

Zeyuan Allen-Zhu , Yang Yuan

Topics

Machine Learning > Optimization & Theory > Neural Network Optimization Machine Learning > Optimization & Theory > Optimization Mathematics & Optimization > Optimization > Stochastic Methods

Keywords

stochastic optimization neural network training non-convex optimization logistic regression convergence analysis stochastic variance reduced gradient non-strongly-convex optimization

Download PDF

Related papers

Associative Long Short-Term Memory 2016

Recycling Randomness with Structure for Sublinear time Kernel Expansions 2016

Stochastically Transitive Models for Pairwise Comparisons: Statistical and Computational Issues 2016

Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization 2016

Hawkes Processes with Stochastic Excitations 2016