Interpretability

349 directly classified papers

Papers

Defining and Quantifying the Emergence of Sparse Concepts in DNNs CVPR 2023

Trusted Fine-Grained Image Classification through Hierarchical Evidence Fusion AAAI 2023

Experimental Observations of the Topology of Convolutional Neural Network Activations AAAI 2023

Class based Influence Functions for Error Detection ACL 2023

Explaining Random Forests Using Bipolar Argumentation and Markov Networks AAAI 2023

Latent Autoregressive Source Separation AAAI 2023

CREST: A Joint Framework for Rationalization and Counterfactual Text Generation ACL 2023

The KITMUS Test: Evaluating Knowledge Integration from Multiple Sources ACL 2023

Neural Representations Reveal Distinct Modes of Class Fitting in Residual Convolutional Networks AAAI 2023

Interpreting Unfairness in Graph Neural Networks via Training Node Attribution AAAI 2023

Log-linear Guardedness and its Implications ACL 2023

Efficient Shapley Values Estimation by Amortization for Text Classification ACL 2023

Very Fast, Approximate Counterfactual Explanations for Decision Forests AAAI 2023

Contrastive Learning with Adversarial Examples for Alleviating Pathology of Language Model ACL 2023

Explaining How Transformers Use Context to Build Predictions ACL 2023

Token-wise Decomposition of Autoregressive Language Model Hidden States for Analyzing Model Predictions ACL 2023

Unsupervised Selective Rationalization with Noise Injection ACL 2023

Symbolic Metamodels for Interpreting Black-Boxes Using Primitive Functions AAAI 2023

REV: Information-Theoretic Evaluation of Free-Text Rationales ACL 2023

On the Interpretability and Significance of Bias Metrics in Texts: a PMI-based Approach ACL 2023

Black-box language model explanation by context length probing ACL 2023

Local Path Integration for Attribution AAAI 2023

COCKATIEL: COntinuous Concept ranKed ATtribution with Interpretable ELements for explaining neural net classifiers on NLP ACL 2023

Easy to Decide, Hard to Agree: Reducing Disagreements Between Saliency Methods ACL 2023

Rethinking Interpretation: Input-Agnostic Saliency Mapping of Deep Visual Classifiers AAAI 2023