ICML 2025

MATH-Perturb: Benchmarking LLMs’ Math Reasoning Abilities against Hard Perturbations