Preconditioned Spectral Descent for Deep Learning

David E Carlson; Edo Collins; Ya-Ping Hsieh; Lawrence Carin; Volkan Cevher

2015 NIPS NeurIPS 2015

Preconditioned Spectral Descent for Deep Learning

Abstract

Deep learning presents notorious computational challenges. These challenges include, but are not limited to, the non-convexity of learning objectives and estimating the quantities needed for optimization algorithms, such as gradients. While we do not address the non-convexity, we present an optimization solution that ex- ploits the so far unused “geometry” in the objective function in order to best make use of the estimated gradients. Previous work attempted similar goals with preconditioned methods in the Euclidean space, such as L-BFGS, RMSprop, and ADA-grad. In stark contrast, our approach combines a non-Euclidean gradient method with preconditioning. We provide evidence that this combination more accurately captures the geometry of the objective function compared to prior work. We theoretically formalize our arguments and derive novel preconditioned non-Euclidean algorithms. The results are promising in both computational time and quality when applied to Restricted Boltzmann Machines, Feedforward Neural Nets, and Convolutional Neural Nets.

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning

📈 Trend Setter — Neural Network Optimization

🧭 Keyword Pioneer — preconditioned gradient descent

🐣 Hot Topic Early Bird — neural network training

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

David E Carlson , Edo Collins , Ya-Ping Hsieh , Lawrence Carin , Volkan Cevher

Topics

Machine Learning > Optimization & Theory > Neural Network Optimization Machine Learning > Optimization & Theory > Optimization Deep Learning > Optimization & Theory > Neural Network Optimization Deep Learning > Optimization & Theory > Optimization

Keywords

neural network training deep learning neural network optimization gradient descent preconditioned gradient descent non-euclidean optimization non-euclidean gradient spectral descent

Download PDF

Related papers

Data Generation as Sequential Decision Making 2015

A Recurrent Latent Variable Model for Sequential Data 2015

Combinatorial Cascading Bandits 2015

Accelerated Mirror Descent in Continuous and Discrete Time 2015

Matrix Completion with Noisy Side Information 2015