ICML 2025

RocketKV: Accelerating Long-Context LLM Inference via Two-Stage KV Cache Compression