Multi-Precision Quantized Neural Networks via Encoding Decomposition of {-1,+1}

Qigong Sun; Fanhua Shang; Kang Yang; Xiufang Li; Yan Ren; Licheng Jiao

AAAI 2019

Abstract

The training of deep neural networks (DNNs) requires intensive resources for both computation and storage. Thus, DNNs cannot be efficiently applied to mobile phones and embedded devices, which seriously limits their applicability in industry applications. To address this issue, we propose a novel encoding scheme that uses {−1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks, which can be efficiently implemented by bitwise operations (xnor and bitcount) to achieve model compression, computational acceleration and resource saving. With our method, users can choose an arbitrary encoding precision according to their requirements and hardware resources. The proposed mechanism is well suited to FPGAs and ASICs in terms of data storage and computation, which provides a feasible idea for smart chips. We validate the effectiveness of our method on both large-scale image classification tasks (e.g., ImageNet) and object detection tasks. In particular, our method with low-bit encoding can still achieve almost the same performance as its full-precision counterpart.
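The decomposition idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each quantized value is an odd integer in [−(2^M − 1), 2^M − 1], so that it can be written as a sum of 2^m · s_m with every s_m in {−1, +1}. The function names (`encode`, `pack`, `binary_dot`, `quantized_dot`) are hypothetical and chosen only for illustration. Each pair of bit planes forms one binary branch, and each branch is evaluated with xnor and bitcount.

```python
def encode(q, bits):
    """Split an odd integer q into `bits` digits in {-1, +1}, least significant first.

    Assumes q = sum(2**m * s_m), which covers odd values in
    [-(2**bits - 1), 2**bits - 1].
    """
    limit = 2 ** bits - 1
    if q % 2 == 0 or abs(q) > limit:
        raise ValueError("q must be an odd integer within the encodable range")
    u = (q + limit) // 2  # maps q to an unsigned integer in [0, 2**bits - 1]
    return [1 if (u >> m) & 1 else -1 for m in range(bits)]


def pack(signs):
    """Pack a list of {-1, +1} values into an integer bit mask (+1 -> 1, -1 -> 0)."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def binary_dot(a_word, b_word, n):
    """Dot product of two packed {-1, +1} vectors of length n via xnor and bitcount."""
    mask = (1 << n) - 1
    agree = bin(~(a_word ^ b_word) & mask).count("1")  # positions with equal signs
    return 2 * agree - n


def quantized_dot(xs, ws, x_bits, w_bits):
    """Dot product of two quantized vectors as a weighted sum of binary branches."""
    n = len(xs)
    x_codes = [encode(x, x_bits) for x in xs]
    w_codes = [encode(w, w_bits) for w in ws]
    # One packed bit plane per encoding position.
    x_planes = [pack([code[m] for code in x_codes]) for m in range(x_bits)]
    w_planes = [pack([code[k] for code in w_codes]) for k in range(w_bits)]
    total = 0
    for m, xp in enumerate(x_planes):
        for k, wp in enumerate(w_planes):
            total += (1 << (m + k)) * binary_dot(xp, wp, n)
    return total


if __name__ == "__main__":
    xs = [3, -1, 1, -3]   # 2-bit encoded activations (illustrative values)
    ws = [1, 1, -3, 3]    # 2-bit encoded weights (illustrative values)
    direct = sum(x * w for x, w in zip(xs, ws))
    print(quantized_dot(xs, ws, 2, 2) == direct)
```

In this sketch, raising `x_bits` or `w_bits` adds more binary branches without changing the bitwise kernel, which is one way to read the abstract's point about choosing the encoding precision to fit the available hardware.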

Topics

Machine Learning > Application Areas > Efficient Computing; Deep Learning > Techniques > Model Architecture; Artificial Intelligence > Core AI > Efficient Computing; Deep Learning > Optimization & Theory > Model Compression

Keywords

neural network quantization; model compression; model acceleration; binary neural network; bitwise operation
