Listen to Interpret: Post-hoc Interpretability for Audio Networks with NMF

Jayneel Parekh; Sanjeel Parekh; Pavlo Mozharovskyi; Florence d'Alché-Buc; Gaël Richard

NeurIPS 2022

Abstract

This paper tackles post-hoc interpretability for audio processing networks. Our goal is to interpret decisions of a trained network in terms of high-level audio objects that are also listenable for the end-user. To this end, we propose a novel interpreter design that incorporates non-negative matrix factorization (NMF). In particular, a regularized interpreter module is trained to take hidden layer representations of the targeted network as input and produce time activations of pre-learnt NMF components as intermediate outputs. Our methodology allows us to generate intuitive audio-based interpretations that explicitly enhance parts of the input signal most relevant for a network's decision. We demonstrate our method's applicability on popular benchmarks, including a real-world multi-label classification task.
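The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the multiplicative-update NMF, the function names `learn_nmf_dictionary` and `enhance_relevant_components`, the toy random spectrogram, and the hand-picked list of relevant components are all assumptions made for illustration. In the paper's setting, the time activations would come from a trained interpreter module fed with hidden-layer representations; here the activations from the factorization itself stand in for that output.

```python
import numpy as np


def learn_nmf_dictionary(X, n_components, n_iter=200, seed=0, eps=1e-9):
    """Factorize a non-negative spectrogram X (freq x time) as W @ H.

    Uses multiplicative updates for the Euclidean objective.
    W holds spectral components, H holds their time activations.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = X.shape
    W = rng.random((n_freq, n_components)) + eps
    H = rng.random((n_components, n_time)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def enhance_relevant_components(X, W, H, relevant, eps=1e-9):
    """Soft-mask X so that only energy explained by `relevant` components remains."""
    full = W @ H + eps                       # reconstruction from all components
    part = W[:, relevant] @ H[relevant, :]   # reconstruction from selected ones
    mask = part / full                       # values lie in [0, 1)
    return mask * X


# Toy magnitude spectrogram (assumption: random data instead of real audio).
X = np.random.default_rng(1).random((64, 100))
W, H = learn_nmf_dictionary(X, n_components=8)

# Hypothetical stand-in: H plays the role of the interpreter's output, and
# components 0 and 2 are assumed to be the ones relevant to the decision.
enhanced = enhance_relevant_components(X, W, H, relevant=[0, 2])
```

To listen to such an interpretation, the masked magnitude spectrogram would typically be combined with the phase of the input and inverted back to a waveform; that step is omitted here.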

Topics

Artificial Intelligence > Core AI > Interpretability; Speech & Audio > Analysis > Speech Analysis; Deep Learning > Optimization & Theory > Theory; Machine Learning > Core Methods > Interpretability

Keywords

non-negative matrix factorization; audio classification; post-hoc explanation; post-hoc interpretability; audio network
