ICML 2025

Domain2Vec: Vectorizing Datasets to Find the Optimal Data Mixture without Training