ICML 2025

Position: Don’t Use the CLT in LLM Evals With Fewer Than a Few Hundred Datapoints