Stop Wasting My Gradients: Practical SVRG

Reza Babanezhad Harikandeh; Mohamed Osama Ahmed; Alim Virani; Mark Schmidt; Jakub Konečný; Scott Sallinen

NIPS (NeurIPS) 2015

Abstract

We present and analyze several strategies for improving the performance of stochastic variance-reduced gradient (SVRG) methods. We first show that the convergence rate of these methods can be preserved under a decreasing sequence of errors in the control variate, and use this to derive variants of SVRG that use growing-batch strategies to reduce the number of gradient calculations required in the early iterations. We further (i) show how to exploit support vectors to reduce the number of gradient computations in the later iterations, (ii) prove that the commonly used regularized SVRG iteration is justified and improves the convergence rate, (iii) consider alternate mini-batch selection strategies, and (iv) consider the generalization error of the method.
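
As a rough illustration of the growing-batch idea described in the abstract, the following Python sketch runs SVRG on a least-squares objective and optionally estimates the control-variate gradient from a subsample that grows at each outer iteration. The function names, the least-squares loss, the step size, and the geometric growth schedule are all assumptions made for this example; they are not taken from the paper.

```python
import random


def grad_i(w, x, y):
    """Gradient of the single-example loss 0.5 * (x.w - y)^2 with respect to w."""
    residual = sum(wj * xj for wj, xj in zip(w, x)) - y
    return [residual * xj for xj in x]


def mean_grad(w, data, indices):
    """Average of the per-example gradients over the given indices."""
    g = [0.0] * len(w)
    for i in indices:
        x, y = data[i]
        for j, gj in enumerate(grad_i(w, x, y)):
            g[j] += gj / len(indices)
    return g


def svrg(data, dim, step=0.1, epochs=30, inner=None,
         batch0=None, growth=2.0, seed=0):
    """Hypothetical SVRG sketch.

    batch0=None uses the full gradient at every snapshot (standard SVRG).
    Otherwise the snapshot gradient is estimated on a subsample whose size
    starts at batch0 and is multiplied by `growth` each outer iteration
    (an assumed schedule), capped at the dataset size.
    """
    rng = random.Random(seed)
    n = len(data)
    inner = n if inner is None else inner
    w = [0.0] * dim
    batch = float(n if batch0 is None else batch0)
    for _ in range(epochs):
        snapshot = list(w)
        size = min(n, int(batch))
        # Sample without replacement; use every example once size reaches n.
        indices = range(n) if size == n else rng.sample(range(n), size)
        mu = mean_grad(snapshot, data, indices)  # control-variate gradient
        for _ in range(inner):
            x, y = data[rng.randrange(n)]
            g_w = grad_i(w, x, y)
            g_s = grad_i(snapshot, x, y)
            # Variance-reduced step: grad_i(w) - grad_i(snapshot) + mu.
            w = [wj - step * (a - b + m)
                 for wj, a, b, m in zip(w, g_w, g_s, mu)]
        batch *= growth
    return w
```

Calling `svrg(data, dim)` with `data` given as a list of `(x, y)` pairs runs the full-gradient variant, while passing, for example, `batch0=5` switches to the growing-batch estimate of the snapshot gradient, which touches fewer examples in the early outer iterations.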

Topics

Machine Learning > Optimization & Theory > Neural Network Optimization

Machine Learning > Optimization & Theory > Optimization

Mathematics & Optimization > Optimization > Stochastic Methods

Deep Learning > Optimization & Theory > Stochastic Methods

Keywords

stochastic gradient, gradient descent, variance reduction, support vector machine, stochastic variance-reduced gradient
