ICML 2025

Maximizing Intermediate Checkpoint Value in LLM Pretraining with Bayesian Optimization