2021 AAAI AAAI 2021

FracBits: Mixed Precision Quantization via Fractional Bit-Widths

Abstract

Abstract Model quantization helps to reduce model size and latency of deep neural networks. Mixed precision quantization is favorable with customized hardwares supporting arithmetic operations at multiple bit-widths to achieve maximum efficiency. We propose a novel learning-based algorithm to derive mixed precision models end-to-end under target computation constraints and model sizes. During the optimization, the bit-width of each layer / kernel in the model is at a fractional status of two consecutive bit-widths which can be adjusted gradually. With a differentiable regularization term, the resource constraints can be met during the quantization-aware training which results in an optimized mixed precision model. Our final models achieve comparable or better performance than previous quantization methods with mixed precision on MobilenetV1/V2, ResNet18 under different resource constraints on ImageNet dataset.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Machine Learning
🧭 Keyword Pioneer — bit-width optimization
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors