SeerNet: Predicting Convolutional Neural Network Feature-Map Sparsity Through Low-Bit Quantization

Shijie Cao; Lingxiao Ma; Wencong Xiao; Chen Zhang; Yunxin Liu; Lintao Zhang; Lanshun Nie; Zhi Yang

CVPR 2019

Abstract

In this paper we present a novel and general method to accelerate convolutional neural network (CNN) inference by taking advantage of feature map sparsity. We experimentally demonstrate that a highly quantized version of the original network is sufficient to predict the output sparsity accurately, and verify that leveraging such sparsity in inference incurs a negligible accuracy drop compared with the original network. To accelerate inference, for each convolution layer our approach first obtains a binary sparsity mask of the output feature maps by running inference on a quantized version of the original network layer, and then conducts a full-precision sparse convolution to compute the precise values of the non-zero outputs. Compared with existing work, our approach avoids the overhead of training additional auxiliary networks, while remaining applicable to general CNNs rather than being limited to certain application domains.
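The two-step procedure in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the symmetric uniform quantizer, the 4-bit default, the ReLU after the convolution, and the helper names (`quantize`, `im2col`, `predicted_sparse_conv_relu`) are all assumptions made for illustration. The sketch predicts a binary mask from a low-bit integer convolution, then computes full-precision dot products only at the positions the mask marks as non-zero.

```python
import numpy as np

def quantize(x, bits=4):
    """Symmetric uniform quantization to signed integers (an assumed scheme)."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = np.max(np.abs(x))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return np.round(x / scale).astype(np.int32), scale

def im2col(x, k):
    """Unfold a (C, H, W) input into (H_out*W_out, C*k*k) patches (stride 1, no padding)."""
    c, h, w = x.shape
    h_out, w_out = h - k + 1, w - k + 1
    cols = np.empty((h_out * w_out, c * k * k), dtype=x.dtype)
    for i in range(h_out):
        for j in range(w_out):
            cols[i * w_out + j] = x[:, i:i + k, j:j + k].ravel()
    return cols, (h_out, w_out)

def predicted_sparse_conv_relu(x, weight, bits=4):
    """x: (C, H, W), weight: (F, C, k, k). Returns (output, mask), both (F, H_out, W_out)."""
    f, _, k, _ = weight.shape
    cols, (h_out, w_out) = im2col(x, k)
    w_mat = weight.reshape(f, -1)

    # Step 1: low-bit convolution. Both scales are positive, so the sign of the
    # integer result matches the sign of the dequantized result.
    q_cols, _ = quantize(cols, bits)
    q_w, _ = quantize(w_mat, bits)
    mask = (q_cols @ q_w.T) > 0  # predicted "non-zero after ReLU" positions

    # Step 2: full-precision dot products only where the mask is set.
    out = np.zeros((cols.shape[0], f), dtype=np.float64)
    pos, filt = np.nonzero(mask)
    out[pos, filt] = np.einsum("ij,ij->i", cols[pos], w_mat[filt])
    out = np.maximum(out, 0.0)  # assumed ReLU; zeroes any mispredicted negatives

    return out.T.reshape(f, h_out, w_out), mask.T.reshape(f, h_out, w_out)

# Usage with random data (shapes chosen arbitrarily for the example).
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 6, 6))
w = rng.standard_normal((4, 3, 3, 3))
out, mask = predicted_sparse_conv_relu(x, w, bits=4)
print("predicted sparsity:", 1.0 - mask.mean())
```

Outputs predicted as zero are skipped entirely, which is where the savings would come from on hardware that can exploit the mask; any position wrongly predicted as zero is the source of the accuracy drop the abstract describes as negligible.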

Topics

Artificial Intelligence > Core AI > Model Compression
Machine Learning > Application Areas > Efficient Computing
Deep Learning > Optimization & Theory > Model Compression
Computer Vision > Core AI > Efficient Computing
Deep Learning > Optimization & Theory > Efficient Computing
Deep Learning > Learning Types > Model Compression

Keywords

model compression, convolutional neural network, inference acceleration, low-bit quantization, feature map sparsity, sparsity prediction
