ICML 2025

Accelerating LLM Inference with Lossless Speculative Decoding Algorithms for Heterogeneous Vocabularies