Input Switched Affine Networks: An RNN Architecture Designed for Interpretability

Jakob N. Foerster; Justin Gilmer; Jascha Sohl-Dickstein; Jan Chorowski; David Sussillo

ICML 2017

Abstract

There exist many problem domains where the interpretability of neural network models is essential for deployment. Here we introduce a recurrent architecture composed of input-switched affine transformations, in other words an RNN without any explicit nonlinearities but with input-dependent recurrent weights. This simple form allows the RNN to be analyzed via straightforward linear methods: we can exactly characterize the linear contribution of each input to the model predictions; we can use a change of basis to disentangle input, output, and computational hidden unit subspaces; and we can fully reverse-engineer the architecture’s solution to a simple task. Despite this ease of interpretation, the input-switched affine network achieves reasonable performance on a text modeling task and allows greater computational efficiency than networks with standard nonlinearities.
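To make the idea of input-dependent recurrent weights concrete, here is a minimal sketch in plain Python. It assumes the update takes the form h_t = W[x_t] h_{t-1} + b[x_t], with one weight matrix and one bias vector per input symbol; the sizes, random parameters, and function names (`isan_step`, `run`, `contributions`) are illustrative assumptions, not taken from the abstract, and the output readout is omitted. Because every step is affine, the final hidden state splits exactly into one additive term per input plus a term for the initial state, which is one way to read the claim about characterizing each input's linear contribution.

```python
import random


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def vec_add(u, v):
    """Elementwise sum of two vectors."""
    return [p + q for p, q in zip(u, v)]


def isan_step(h, token, W, b):
    """One input-switched affine update (assumed form): W[token] h + b[token]."""
    return vec_add(matvec(W[token], h), b[token])


def run(tokens, h0, W, b):
    """Apply the affine update once per token and return the final state."""
    h = h0
    for tok in tokens:
        h = isan_step(h, tok, W, b)
    return h


def contributions(tokens, h0, W, b):
    """Split the final state into additive terms.

    terms[0] is the initial state propagated through every step;
    terms[s + 1] is the bias injected by token s, propagated through
    the weight matrices of all later tokens.
    """
    terms = [h0]
    for tok in tokens:
        terms = [matvec(W[tok], t) for t in terms]  # propagate earlier terms
        terms.append(b[tok])                        # new term for this token
    return terms


# Hypothetical toy parameters: 4 input symbols, 3 hidden units.
rng = random.Random(0)
n_hidden, vocab = 3, 4
W = [[[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
      for _ in range(n_hidden)] for _ in range(vocab)]
b = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(vocab)]
h0 = [0.0] * n_hidden
tokens = [2, 0, 3, 1]

h_final = run(tokens, h0, W, b)
terms = contributions(tokens, h0, W, b)
h_sum = [sum(col) for col in zip(*terms)]  # per-unit sum of all terms
```

In this sketch `h_sum` matches `h_final` up to floating-point rounding, so each entry of `terms` can be inspected as the share of the final state attributable to a single input.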



Topics

Artificial Intelligence > Core AI > Interpretability

Deep Learning > Architectures > Neural Networks

Keywords

linear methods, text modeling, recurrent neural network, affine transformation

Related papers

Bottleneck Conditional Density Estimation 2017

Constrained Policy Optimization 2017

Near-Optimal Design of Experiments via Regret Minimization 2017

Input Convex Neural Networks 2017

An Efficient, Sparsity-Preserving, Online Algorithm for Low-Rank Approximation 2017