ICML 2025
SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability
Authors
Adam Karvonen, Can Rager, Johnny Lin, Curt Tigges, Joseph Isaac Bloom, David Chanin, Yeu-Tong Lau, Eoin Farrell, Callum Stuart McDougall, Kola Ayonrinde, Demian Till, Matthew Wearden, Arthur Conmy, Samuel Marks, Neel Nanda