Research Explorer
Interpretability
173 directly classified papers
Papers per year
2013: 1
2015: 1
2016: 2
2017: 1
2018: 6
2019: 9
2020: 24
2021: 31
2022: 34
2023: 15
2024: 20
2025: 29
Papers
Learning Tree Interpretation from Object Representation for Deep Reinforcement Learning
NIPS 2021
Explicable Reward Design for Reinforcement Learning Agents
NIPS 2021
Be Confident! Towards Trustworthy Graph Neural Networks via Confidence Calibration
NIPS 2021
The effectiveness of feature attribution methods and its correlation with automatic evaluation scores
NIPS 2021
Towards Faithfully Interpretable NLP Systems: How Should We Define and Evaluate Faithfulness?
ACL 2020
Obtaining Faithful Interpretations from Compositional Neural Networks
ACL 2020
Towards Interpretable Natural Language Understanding with Explanations as Latent Variables
NIPS 2020
Beyond Individualized Recourse: Interpretable and Interactive Summaries of Actionable Recourses
NIPS 2020
PGM-Explainer: Probabilistic Graphical Model Explanations for Graph Neural Networks
NIPS 2020
Decisions, Counterfactual Explanations and Strategic Behavior
NIPS 2020
Discovering Symbolic Models from Deep Learning with Inductive Biases
NIPS 2020
Interpreting Twitter User Geolocation
ACL 2020
Understanding Attention for Text Classification
ACL 2020
Towards Interpretable Semantic Segmentation via Gradient-Weighted Class Activation Mapping (Student Abstract)
AAAI 2020
AI Trust in Business Processes: The Need for Process-Aware Explanations
AAAI 2020
Word-Level Contextual Sentiment Analysis with Interpretability
AAAI 2020
Interpretable and Differentially Private Predictions
AAAI 2020
Relative Attributing Propagation: Interpreting the Comparative Contributions of Individual Units in Deep Neural Networks
AAAI 2020
Evaluating Attribution Methods using White-Box LSTMs
EMNLP 2020
Structured Self-Attention Weights Encode Semantics in Sentiment Analysis
EMNLP 2020
Neural Natural Language Inference Models Partially Embed Theories of Lexical Entailment and Negation
EMNLP 2020
Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language?
EMNLP 2020
Inserting Information Bottlenecks for Attribution in Transformers
EMNLP 2020
Zero-Shot Rationalization by Multi-Task Transfer Learning from Question Answering
EMNLP 2020
Gradient-based Analysis of NLP Models is Manipulable
EMNLP 2020