Research Explorer
Interpretability
349 directly classified papers
Papers per year
2008: 1
2014: 1
2015: 2
2016: 4
2017: 4
2018: 10
2019: 29
2020: 41
2021: 40
2022: 65
2023: 55
2024: 56
2025: 41
Papers
Defining and Quantifying the Emergence of Sparse Concepts in DNNs
CVPR 2023
Trusted Fine-Grained Image Classification through Hierarchical Evidence Fusion
AAAI 2023
Experimental Observations of the Topology of Convolutional Neural Network Activations
AAAI 2023
Class based Influence Functions for Error Detection
ACL 2023
Explaining Random Forests Using Bipolar Argumentation and Markov Networks
AAAI 2023
Latent Autoregressive Source Separation
AAAI 2023
CREST: A Joint Framework for Rationalization and Counterfactual Text Generation
ACL 2023
The KITMUS Test: Evaluating Knowledge Integration from Multiple Sources
ACL 2023
Neural Representations Reveal Distinct Modes of Class Fitting in Residual Convolutional Networks
AAAI 2023
Interpreting Unfairness in Graph Neural Networks via Training Node Attribution
AAAI 2023
Log-linear Guardedness and its Implications
ACL 2023
Efficient Shapley Values Estimation by Amortization for Text Classification
ACL 2023
Very Fast, Approximate Counterfactual Explanations for Decision Forests
AAAI 2023
Contrastive Learning with Adversarial Examples for Alleviating Pathology of Language Model
ACL 2023
Explaining How Transformers Use Context to Build Predictions
ACL 2023
Token-wise Decomposition of Autoregressive Language Model Hidden States for Analyzing Model Predictions
ACL 2023
Unsupervised Selective Rationalization with Noise Injection
ACL 2023
Symbolic Metamodels for Interpreting Black-Boxes Using Primitive Functions
AAAI 2023
REV: Information-Theoretic Evaluation of Free-Text Rationales
ACL 2023
On the Interpretability and Significance of Bias Metrics in Texts: a PMI-based Approach
ACL 2023
Black-box language model explanation by context length probing
ACL 2023
Local Path Integration for Attribution
AAAI 2023
COCKATIEL: COntinuous Concept ranKed ATtribution with Interpretable ELements for explaining neural net classifiers on NLP
ACL 2023
Easy to Decide, Hard to Agree: Reducing Disagreements Between Saliency Methods
ACL 2023
Rethinking Interpretation: Input-Agnostic Saliency Mapping of Deep Visual Classifiers
AAAI 2023