Variance-Reduced and Projection-Free Stochastic Optimization

Elad Hazan; Haipeng Luo

2016 ICML ICML 2016

Variance-Reduced and Projection-Free Stochastic Optimization

Abstract

The Frank-Wolfe optimization algorithm has recently regained popularity for machine learning applications due to its projection-free property and its ability to handle structured constraints. However, in the stochastic learning setting, it is still relatively understudied compared to the gradient descent counterpart. In this work, leveraging a recent variance reduction technique, we propose two stochastic Frank-Wolfe variants which substantially improve previous results in terms of the number of stochastic gradient evaluations needed to achieve 1-εaccuracy. For example, we improve from O(\frac1ε) to O(\ln\frac1ε) if the objective function is smooth and strongly convex, and from O(\frac1ε^2) to O(\frac1ε^1.5) if the objective function is smooth and Lipschitz. The theoretical improvement is also observed in experiments on real-world datasets for a multiclass classification application.

🐣 Hot Topic Early Bird — stochastic optimization

🐝 Cross-Pollinator — Artificial Intelligence, Computer Vision, Data Science & Analytics, Deep Learning, Interdisciplinary, Machine Learning, Mathematics & Optimization, Reinforcement Learning, Speech & Audio

Authors

Elad Hazan , Haipeng Luo

Topics

Machine Learning > Core Methods > Classification Machine Learning > Optimization & Theory > Optimization Machine Learning > Optimization & Theory > Stochastic Methods

Keywords

stochastic optimization convex optimization multiclass classification variance reduction projection-free optimization frank-wolfe algorithm

Download PDF

Related papers

Associative Long Short-Term Memory 2016

Recycling Randomness with Structure for Sublinear time Kernel Expansions 2016

Stochastically Transitive Models for Pairwise Comparisons: Statistical and Computational Issues 2016

Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization 2016

Hawkes Processes with Stochastic Excitations 2016