SGD-QN: Careful Quasi-Newton Stochastic Gradient Descent

Antoine Bordes; Léon Bottou; Patrick Gallinari

JMLR, 2009

Abstract

The SGD-QN algorithm is a stochastic gradient descent algorithm that makes careful use of second-order information and splits the parameter update into independently scheduled components. Thanks to this design, SGD-QN iterates nearly as fast as a first-order stochastic gradient descent but requires fewer iterations to achieve the same accuracy. This algorithm won the "Wild Track" of the first PASCAL Large Scale Learning Challenge (Sonnenburg et al., 2008).
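The two ideas named in the abstract, a cheap second-order rescaling and an update split into separately scheduled parts, can be illustrated with a minimal Python sketch. It is not the published SGD-QN procedure: the logistic loss, the running-average rule for the diagonal scaling `B`, the clipping bounds, the toy data, and every name (`sgd_diag_scaled`, `skip`, `t0`) are assumptions chosen for illustration. The loss part of the update runs on every example, while the regularization shrink and the refresh of `B` run only every `skip` iterations.

```python
import math
import random


def dloss(z):
    """Derivative of the logistic loss log(1 + exp(-z)) with respect to z."""
    if z >= 0:
        e = math.exp(-z)
        return -e / (1.0 + e)
    return -1.0 / (1.0 + math.exp(z))


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def sgd_diag_scaled(data, lam=0.01, epochs=20, skip=4, t0=10.0, seed=0):
    """Illustrative SGD with a diagonal scaling B (assumed scheme, see text)."""
    rng = random.Random(seed)
    dim = len(data[0][0])
    w = [0.0] * dim
    b_min, b_max = 0.01 / lam, 1.0 / lam  # assumed clipping bounds
    B = [b_max] * dim                     # one scaling factor per coordinate
    t, r = 0, 1
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for idx in order:
            x, y = data[idx]
            step = 1.0 / (t + t0)
            g_old = dloss(y * dot(w, x))
            # Component 1: loss update, applied on every example.
            w_new = [wi - step * Bi * g_old * y * xi
                     for wi, Bi, xi in zip(w, B, x)]
            if (t + 1) % skip == 0:
                # Refresh B from a secant-style ratio measured on this example:
                # parameter change divided by regularized-gradient change.
                g_new = dloss(y * dot(w_new, x))
                for i in range(dim):
                    dw = w_new[i] - w[i]
                    p = lam * dw + (g_new - g_old) * y * x[i]
                    if abs(p) > 1e-12:
                        B[i] += (dw / p - B[i]) / r  # running average
                        B[i] = min(b_max, max(b_min, B[i]))
                r += 1
                # Component 2: regularization shrink, applied every `skip`
                # iterations and scaled by `skip` to compensate.
                w_new = [wi - skip * step * lam * Bi * wi
                         for wi, Bi in zip(w_new, B)]
            w = w_new
            t += 1
    return w, B


# Hypothetical, linearly separable toy data: (features, label in {-1, +1}).
toy_data = [
    ((1.0, 2.0), 1), ((2.0, 1.0), 1), ((2.0, 3.0), 1),
    ((-1.0, -2.0), -1), ((-2.0, -1.0), -1), ((-3.0, -2.0), -1),
]
w, B = sgd_diag_scaled(toy_data)
print("weights:", w)
print("diagonal scaling:", B)
```

Deferring the shrink and the `B` refresh to every `skip`-th iteration keeps most iterations as cheap as a plain first-order step, which is the trade-off the abstract describes.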

Topics

Machine Learning > Optimization & Theory > Neural Network Optimization
Machine Learning > Optimization & Theory > Optimization
Mathematics & Optimization > Optimization > Stochastic Methods

Keywords

stochastic gradient descent; large scale learning; second-order optimization; quasi-Newton method; second-order information; parameter update; linear rate

Related papers

Subgroup Analysis via Recursive Partitioning 2009

A New Approach to Collaborative Filtering: Operator Estimation with Spectral Regularization 2009

An Analysis of Convex Relaxations for MAP Estimation of Discrete MRFs 2009

Nonextensive Information Theoretic Kernels on Measures 2009

The Hidden Life of Latent Variables: Bayesian Learning with Mixed Graph Models 2009