Interpretable Neural Predictions with Differentiable Binary Variables

Jasmijn Bastings; Wilker Aziz; Ivan Titov

ACL 2019

Abstract

The success of neural networks comes hand in hand with a desire for more interpretability. We focus on text classifiers and make them more interpretable by having them provide a justification (a rationale) for their predictions. We approach this problem by jointly training two neural network models: a latent model that selects a rationale (i.e. a short and informative part of the input text), and a classifier that learns from the words in the rationale alone. Previous work proposed to assign binary latent masks to input positions and to promote short selections via sparsity-inducing penalties such as L0 regularisation. We propose a latent model that mixes discrete and continuous behaviour, allowing at the same time for binary selections and gradient-based training without REINFORCE. In our formulation, we can tractably compute the expected value of penalties such as L0, which allows us to directly optimise the model towards a pre-specified text selection rate. We show that our approach is competitive with previous work on rationale extraction, and explore further uses in attention mechanisms.
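The abstract does not name the distribution behind the "differentiable binary variables", so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes each gate is a Kumaraswamy(a, b) variable stretched to a hypothetical interval (L, R) = (-0.1, 1.1) and then clamped to [0, 1], which places point masses at exactly 0 and 1 while keeping a continuous density in between. Under that assumption, the probability that a gate is non-zero has a closed form, so the expected L0 of a mask is a plain sum. The function names, the stretch bounds, and the absolute-gap penalty are all illustrative choices.

```python
import random

# Hypothetical stretch bounds (assumption): support (0, 1) is widened to (L, R).
L, R = -0.1, 1.1


def kuma_cdf(x, a, b):
    """CDF of a Kumaraswamy(a, b) variable on (0, 1)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return 1.0 - (1.0 - x ** a) ** b


def sample_gate(a, b, rng):
    """Draw one gate in [0, 1] by inverse-CDF sampling, stretching, clamping."""
    u = rng.random()                                   # uniform noise in [0, 1)
    x = (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)    # Kumaraswamy sample
    t = L + (R - L) * x                                # stretch to (L, R)
    return min(1.0, max(0.0, t))                       # clamp: exact 0s and 1s


def prob_nonzero(a, b):
    """P(gate != 0): the gate is 0 exactly when the stretched value is <= 0."""
    return 1.0 - kuma_cdf(-L / (R - L), a, b)


def expected_l0(params):
    """Expected number of selected positions for a list of (a, b) pairs."""
    return sum(prob_nonzero(a, b) for a, b in params)


def selection_rate_penalty(params, target_rate):
    """Gap between the expected selection rate and a pre-specified target."""
    return abs(expected_l0(params) / len(params) - target_rate)
```

Every step in `sample_gate` and `prob_nonzero` is a smooth function of a and b almost everywhere, so the same formulas written in an automatic-differentiation framework would give gradients without a score-function (REINFORCE) estimator. For instance, `selection_rate_penalty([(1.0, 1.0)] * 4, 0.3)` measures how far four uniform gates are from a 30% selection rate.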

Topics

Artificial Intelligence > Core AI > Interpretability
Deep Learning > Models > Variational Inference
Natural Language Processing > Applications > Text Classification
Deep Learning > Learning Types > Representation Learning
Deep Learning > Techniques > Attention
Artificial Intelligence > Core AI > Natural Language Processing

Keywords

attention mechanism, rationale extraction, L0 regularization, sparsity-inducing penalty, text classifier, binary latent mask, differentiable binary variable, interpretable neural prediction

Related papers

What do phone embeddings learn about Phonology? 2019

Unsupervised Morphological Segmentation for Low-Resource Polysynthetic Languages 2019

Understanding Undesirable Word Embedding Associations 2019

Inferential Machine Comprehension: Answering Questions by Recursively Deducing the Evidence Chain from Text 2019

Domain Adaptation of Neural Machine Translation by Lexicon Induction 2019