ICML 2025

SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability