ICML 2025

Cache Me If You Must: Adaptive Key-Value Quantization for Large Language Models