ICML 2025

Metastable Dynamics of Chain-of-Thought Reasoning: Provable Benefits of Search, RL and Distillation