Optimal and Approximate Adaptive Stochastic Quantization

Yaniv Ben-Itzhak; Michael Mitzenmacher; Shay Vargaftik; Ran Ben-Basat

NeurIPS 2024

Abstract

Quantization is a fundamental optimization for many machine learning (ML) use cases, including compressing gradients, model weights and activations, and datasets. The most accurate form of quantization is adaptive, where the error is minimized with respect to a given input rather than optimized for the worst case. However, optimal adaptive quantization methods are considered infeasible in terms of both their runtime and memory requirements. We revisit the Adaptive Stochastic Quantization (ASQ) problem and present algorithms that find optimal solutions with asymptotically improved time and space complexities. Our experiments indicate that our algorithms may open the door to using ASQ more extensively in a variety of ML applications. We also present an even faster approximation algorithm for quantizing large inputs on the fly.
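
To make the adaptive objective concrete, here is a minimal Python sketch of stochastic quantization. It is not the paper's algorithm. It assumes the usual unbiased rounding rule, in which each value is rounded to one of its two neighboring quantization values with probabilities chosen so that the expectation equals the input, and it measures error as the sum of per-coordinate variances. The function names, the toy input, and both sets of levels are hypothetical; the "adaptive" levels were picked by hand for this input, not by an optimal solver.

```python
import numpy as np

def _neighbors(x, levels):
    """Return the lower and upper quantization value around each x[i].

    Assumes levels are distinct and cover [min(x), max(x)].
    """
    x = np.asarray(x, dtype=float)
    levels = np.sort(np.asarray(levels, dtype=float))
    # Index of the upper neighbor, clipped so both neighbors always exist.
    hi_idx = np.clip(np.searchsorted(levels, x, side="left"), 1, len(levels) - 1)
    return x, levels[hi_idx - 1], levels[hi_idx]

def stochastic_quantize(x, levels, rng):
    """Round each entry up or down at random so that E[result] == x."""
    x, lo, hi = _neighbors(x, levels)
    p_up = (x - lo) / (hi - lo)  # probability of rounding up
    return np.where(rng.random(x.shape) < p_up, hi, lo)

def expected_sq_error(x, levels):
    """Sum of variances; each coordinate contributes (x - lo) * (hi - x)."""
    x, lo, hi = _neighbors(x, levels)
    return float(np.sum((x - lo) * (hi - x)))

if __name__ == "__main__":
    x = [0.0, 0.1, 0.2, 0.9, 1.0]        # toy input (hypothetical)
    uniform = [0.0, 1 / 3, 2 / 3, 1.0]   # input-oblivious levels
    adaptive = [0.0, 0.2, 0.9, 1.0]      # hand-picked for this input
    rng = np.random.default_rng(0)
    print(stochastic_quantize(x, adaptive, rng))
    print(expected_sq_error(x, uniform), expected_sq_error(x, adaptive))
```

With the same number of quantization values, the levels chosen for this particular input give a smaller expected squared error than the evenly spaced ones, which is the gap that adaptive methods aim to exploit.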

Authors

Ran Ben-Basat, Yaniv Ben-Itzhak, Michael Mitzenmacher, Shay Vargaftik

Topics

Machine Learning > Optimization & Theory > Optimization
Machine Learning > Application Areas > Efficient Computing
Machine Learning > Application Areas > Model Compression
Machine Learning > Core Methods > Optimization
Deep Learning > Optimization & Theory > Optimization
Deep Learning > Optimization & Theory > Efficient Computing
Machine Learning > Learning Types > Optimization

Keywords

model compression, memory complexity, optimization algorithm, gradient compression, stochastic quantization, runtime complexity, adaptive quantization
