Research Explorer
Interpretability
49 directly classified papers
Papers per year
2018: 5
2019: 5
2020: 5
2021: 8
2022: 4
2023: 6
2024: 13
2025: 3
Papers
Tracing and Manipulating Intermediate Values in Neural Math Problem Solvers
EMNLP 2022
Adaptive Consistency Prior Based Deep Network for Image Denoising
CVPR 2021
Uncovering Constraint-Based Behavior in Neural Models via Targeted Fine-Tuning
ACL 2021
Analyzing the Source and Target Contributions to Predictions in Neural Machine Translation
ACL 2021
Integrated Directional Gradients: Feature Interaction Attribution for Neural NLP Models
ACL 2021
Explaining Convolutional Neural Networks through Attribution-Based Input Sampling and Block-Wise Feature Aggregation
AAAI 2021
Interpreting Deep Neural Networks with Relative Sectional Propagation by Analyzing Comparative Gradients and Hostile Activations
AAAI 2021
HyDRA: Hypergradient Data Relevance Analysis for Interpreting Deep Neural Networks
AAAI 2021
Neural Response Interpretation Through the Lens of Critical Pathways
CVPR 2021
Inserting Information Bottlenecks for Attribution in Transformers
EMNLP 2020
Benchmarking Deep Learning Interpretability in Time Series Predictions
NIPS 2020
Detecting Adversarial Samples Using Influence Functions and Nearest Neighbors
CVPR 2020
Explaining Knowledge Distillation by Quantifying the Knowledge
CVPR 2020
Analyzing Individual Neurons in Pre-trained Language Models
EMNLP 2020
Classifier-Agnostic Saliency Map Extraction
AAAI 2019
From deep learning to mechanistic understanding in neuroscience: the structure of retinal prediction
NIPS 2019
Attention is not not Explanation
EMNLP 2019
NeuroX: A Toolkit for Analyzing Individual Neurons in Neural Networks
AAAI 2019
Interpreting Deep Models for Text Analysis via Optimization and Regularization Methods
AAAI 2019
Interpretable Neural Architectures for Attributing an Ad’s Performance to its Writing Style
EMNLP 2018
Interpreting Neural Network Judgments via Minimal, Stable, and Symbolic Corrections
NIPS 2018
LISA: Explaining Recurrent Neural Network Judgments via Layer-wIse Semantic Accumulation and Example to Pattern Transformation
EMNLP 2018
Learning Explanations from Language Data
EMNLP 2018
Net2Vec: Quantifying and Explaining How Concepts Are Encoded by Filters in Deep Neural Networks
CVPR 2018