Leveraging Sparse Linear Layers for Debuggable Deep Networks

Eric Wong; Shibani Santurkar; Aleksander Madry

ICML 2021

Abstract

We show how fitting sparse linear models over learned deep feature representations can lead to more debuggable neural networks. These networks remain highly accurate while also being more amenable to human interpretation, as we demonstrate quantitatively and via human experiments. We further illustrate how the resulting sparse explanations can help to identify spurious correlations, explain misclassifications, and diagnose model biases in vision and language tasks.
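The core idea in the abstract, fitting a sparse linear model on top of fixed deep features, can be illustrated with a small sketch. Everything here is an assumption made for illustration and not the authors' procedure: the features are random numbers standing in for a network's penultimate-layer activations, the task is regression rather than classification, the penalty is a plain L1 term, and the solver is basic proximal gradient descent (ISTA). The names `fit_sparse_linear`, `soft_threshold`, and `lam` are hypothetical.

```python
import numpy as np


def soft_threshold(w, t):
    """Proximal operator of t * ||w||_1: shrink each entry toward zero by t."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)


def fit_sparse_linear(features, targets, lam=0.1, n_iters=500):
    """Minimize (1/2n) * ||X w - y||^2 + lam * ||w||_1 with ISTA."""
    n, d = features.shape
    # Step size 1/L, where L is the Lipschitz constant of the smooth term's gradient.
    lipschitz = np.linalg.norm(features, ord=2) ** 2 / n
    step = 1.0 / lipschitz
    w = np.zeros(d)
    for _ in range(n_iters):
        grad = features.T @ (features @ w - targets) / n
        w = soft_threshold(w - step * grad, step * lam)
    return w


# Synthetic stand-in for deep features (assumption: 200 examples, 50 features).
rng = np.random.default_rng(0)
n_samples, n_features = 200, 50
features = rng.standard_normal((n_samples, n_features))

# Assumption: the target depends on only three of the fifty features.
true_w = np.zeros(n_features)
true_w[[3, 17, 42]] = [2.0, -1.5, 1.0]
targets = features @ true_w + 0.01 * rng.standard_normal(n_samples)

w = fit_sparse_linear(features, targets, lam=0.1)
active = np.flatnonzero(w)  # indices of features the sparse model actually uses
print("features used by the sparse model:", active.tolist())
```

Because the fitted weight vector has only a few nonzero entries, a person inspecting the model needs to look at only those few features to understand a prediction, which is the sense in which a sparse final layer is easier to debug. Raising `lam` makes the model sparser at some cost in fit.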

Topics

Artificial Intelligence > Core AI > Interpretability
Artificial Intelligence > Core AI > Model Compression
Deep Learning > Techniques > Model Architecture
Machine Learning > Application Areas > Model Compression
Machine Learning > Core Methods > Model Compression

Keywords

feature attribution, feature representation, model debugging, sparse model, sparse linear model, spurious correlation, neural network
