Interpretability

49 directly classified papers

Papers

LMO: Linear Mamba Operator for MRI Reconstruction CVPR 2025

TokenShapley: Token Level Context Attribution with Shapley Value ACL 2025

TRACE: Training and Inference-Time Interpretability Analysis for Language Models EMNLP 2025

Transformer Doctor: Diagnosing and Treating Vision Transformers NIPS 2024

Denoising Diffusion Path: Attribution Noise Reduction with An Auxiliary Diffusion Model NIPS 2024

Dissenting Explanations: Leveraging Disagreement to Reduce Model Overreliance AAAI 2024

Explaining the Implicit Neural Canvas: Connecting Pixels to Neurons by Tracing their Contributions CVPR 2024

Identifying Important Group of Pixels using Interactions CVPR 2024

G-LIME: Statistical Learning for Local Interpretations of Deep Neural Networks Using Global Priors (Abstract Reprint) AAAI 2024

Validation, Robustness, and Accuracy of Perturbation-Based Sensitivity Analysis Methods for Time-Series Deep Learning Models AAAI 2024

Representational Analysis of Binding in Language Models EMNLP 2024

Information Flow Routes: Automatically Interpreting Language Models at Scale EMNLP 2024

InterpBench: Semi-Synthetic Transformers for Evaluating Mechanistic Interpretability Techniques NIPS 2024

Interpreting CLIP with Sparse Linear Concept Embeddings (SpLiCE) NIPS 2024

PICNN: A Pathway towards Interpretable Convolutional Neural Networks AAAI 2024

Visual Concept Connectome (VCC): Open World Concept Discovery and their Interlayer Connections in Deep Models CVPR 2024

Improving Interpretability via Explicit Word Interaction Graph Layer AAAI 2023

Testing the Channels of Convolutional Neural Networks AAAI 2023

Towards Better Visualizing the Decision Basis of Networks via Unfold and Conquer Attribution Guidance AAAI 2023

SmoothHess: ReLU Network Feature Interactions via Stein's Lemma NIPS 2023

This Looks Like Those: Illuminating Prototypical Concepts Using Multiple Visualizations NIPS 2023

Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models EMNLP 2023

DeepVisualInsight: Time-Travelling Visualization for Spatio-Temporal Causality of Deep Classification Training AAAI 2022

Is Attention Explanation? An Introduction to the Debate ACL 2022

Explaining Classes through Stable Word Attributions ACL 2022