Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation

Emily L Denton; Wojciech Zaremba; Joan Bruna; Yann LeCun; Rob Fergus

NeurIPS 2014

Abstract

We present techniques for speeding up the test-time evaluation of large convolutional networks designed for object recognition tasks. These models deliver impressive accuracy, but each image evaluation requires millions of floating-point operations, making their deployment on smartphones and Internet-scale clusters problematic. The computation is dominated by the convolution operations in the lower layers of the model. We exploit the redundancy present within the convolutional filters to derive approximations that significantly reduce the required computation. Using large state-of-the-art models, we demonstrate speedups of convolutional layers on both CPU and GPU by a factor of 2×, while keeping the accuracy within 1% of the original model.
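As a rough illustration of what exploiting filter redundancy can look like, the sketch below flattens a filter bank into a matrix and keeps only its leading singular directions. This is a generic truncated-SVD example, not necessarily the approximation scheme used in the paper; the function name `low_rank_filters`, the synthetic filter bank, and the chosen shapes and rank are all assumptions made for illustration.

```python
import numpy as np

def low_rank_filters(weights, rank):
    """Factor a filter bank of shape (out_ch, in_ch, kh, kw) into
    `rank` shared basis filters plus per-filter mixing coefficients."""
    out_ch = weights.shape[0]
    flat = weights.reshape(out_ch, -1)           # one row per output filter
    u, s, vt = np.linalg.svd(flat, full_matrices=False)
    basis = vt[:rank]                            # (rank, in_ch*kh*kw)
    mix = u[:, :rank] * s[:rank]                 # (out_ch, rank)
    return mix, basis

rng = np.random.default_rng(0)

# Hypothetical, deliberately redundant filter bank: 64 filters of size
# 3x5x5 built from only 8 underlying basis filters, plus small noise.
true_basis = rng.standard_normal((8, 3 * 5 * 5))
weights = (rng.standard_normal((64, 8)) @ true_basis).reshape(64, 3, 5, 5)
weights += 0.01 * rng.standard_normal(weights.shape)

mix, basis = low_rank_filters(weights, rank=8)
approx = (mix @ basis).reshape(weights.shape)
rel_err = np.linalg.norm(approx - weights) / np.linalg.norm(weights)

# Multiply-adds per output position: original filters versus
# "apply 8 basis filters, then mix their responses into 64 outputs".
full_cost = 64 * (3 * 5 * 5)            # 4800
low_cost = 8 * (3 * 5 * 5) + 64 * 8     # 1112
```

In this toy setting the reconstruction error is small because the filters were constructed to be redundant; real trained filters are only approximately low-rank, so the rank trades accuracy against computation.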

Topics

Artificial Intelligence > Core AI > Model Compression
Machine Learning > Application Areas > Efficient Computing
Deep Learning > Optimization & Theory > Model Compression
Deep Learning > Optimization & Theory > Efficient Computing
Deep Learning > Architectures > Convolutional Neural Networks

Keywords

model compression, neural network optimization, efficient computing, convolutional neural network, convolutional network, filter redundancy, filter approximation
