Interpretability

173 directly classified papers

Papers

Learning Tree Interpretation from Object Representation for Deep Reinforcement Learning NIPS 2021

Explicable Reward Design for Reinforcement Learning Agents NIPS 2021

Be Confident! Towards Trustworthy Graph Neural Networks via Confidence Calibration NIPS 2021

The effectiveness of feature attribution methods and its correlation with automatic evaluation scores NIPS 2021

Towards Faithfully Interpretable NLP Systems: How Should We Define and Evaluate Faithfulness? ACL 2020

Obtaining Faithful Interpretations from Compositional Neural Networks ACL 2020

Towards Interpretable Natural Language Understanding with Explanations as Latent Variables NIPS 2020

Beyond Individualized Recourse: Interpretable and Interactive Summaries of Actionable Recourses NIPS 2020

PGM-Explainer: Probabilistic Graphical Model Explanations for Graph Neural Networks NIPS 2020

Decisions, Counterfactual Explanations and Strategic Behavior NIPS 2020

Discovering Symbolic Models from Deep Learning with Inductive Biases NIPS 2020

Interpreting Twitter User Geolocation ACL 2020

Understanding Attention for Text Classification ACL 2020

Towards Interpretable Semantic Segmentation via Gradient-Weighted Class Activation Mapping (Student Abstract) AAAI 2020

AI Trust in Business Processes: The Need for Process-Aware Explanations AAAI 2020

Word-Level Contextual Sentiment Analysis with Interpretability AAAI 2020

Interpretable and Differentially Private Predictions AAAI 2020

Relative Attributing Propagation: Interpreting the Comparative Contributions of Individual Units in Deep Neural Networks AAAI 2020

Evaluating Attribution Methods using White-Box LSTMs EMNLP 2020

Structured Self-Attention Weights Encode Semantics in Sentiment Analysis EMNLP 2020

Neural Natural Language Inference Models Partially Embed Theories of Lexical Entailment and Negation EMNLP 2020

Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language? EMNLP 2020

Inserting Information Bottlenecks for Attribution in Transformers EMNLP 2020

Zero-Shot Rationalization by Multi-Task Transfer Learning from Question Answering EMNLP 2020

Gradient-based Analysis of NLP Models is Manipulable EMNLP 2020