Stochastic Gradient Descent with Only One Projection

Mehrdad Mahdavi; Tianbao Yang; Rong Jin; Shenghuo Zhu; Jinfeng Yi

2012 NIPS NeurIPS 2012

Stochastic Gradient Descent with Only One Projection

Abstract

Although many variants of stochastic gradient descent have been proposed for large-scale convex optimization, most of them require projecting the solution at {\it each} iteration to ensure that the obtained solution stays within the feasible domain. For complex domains (e.g., positive semidefinite cone), the projection step can be computationally expensive, making stochastic gradient descent unattractive for large-scale optimization problems. We address this limitation by developing a novel stochastic gradient descent algorithm that does not need intermediate projections. Instead, only one projection at the last iteration is needed to obtain a feasible solution in the given domain. Our theoretical analysis shows that with a high probability, the proposed algorithms achieve an $O(1/\sqrt{T})$ convergence rate for general convex optimization, and an $O(\ln T/T)$ rate for strongly convex optimization under mild conditions about the domain and the objective function.

🌉 Interdisciplinary Bridge — Machine Learning and Mathematics & Optimization

🧭 Keyword Pioneer — projection-free optimization

🐣 Hot Topic Early Bird — stochastic gradient descent

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy

📈 Trend Setter — Neural Network Optimization

Authors

Mehrdad Mahdavi , Tianbao Yang , Rong Jin , Shenghuo Zhu , Jinfeng Yi

Topics

Machine Learning > Optimization & Theory > Optimization Mathematics & Optimization > Optimization > Continuous Optimization Mathematics & Optimization > Optimization > Stochastic Methods Machine Learning > Optimization & Theory > Stochastic Methods Mathematics & Optimization > Optimization > Optimization Deep Learning > Optimization & Theory > Neural Network Optimization Mathematics & Optimization > Optimization > Convex Optimization

Keywords

stochastic gradient descent large-scale optimization convex optimization strongly convex projection-free optimization online algorithm optimization algorithm convergence rate projection method

Download PDF

Related papers

Kernel Hyperalignment 2012

Fused sparsity and robust estimation for linear models with unknown variance 2012

Slice sampling normalized kernel-weighted completely random measure mixture models 2012

Scaling MPE Inference for Constrained Continuous Markov Random Fields with Consensus Optimization 2012

Matrix reconstruction with the local max norm 2012