Interpretable Tensor Fusion

Saurabh Varshneya; Antoine Ledent; Philipp Liznerski; Andriy Balinskyy; Purvanshi Mehta; Waleed Mustafa; Marius Kloft

2024 IJCAI IJCAI 2024

Interpretable Tensor Fusion

Abstract

Conventional machine learning methods are predominantly designed to predict outcomes based on a single data type. However, practical applications may encompass data of diverse types, such as text, images, and audio. We introduce interpretable tensor fusion (InTense), a multimodal learning method training a neural network to simultaneously learn multiple data representations and their interpretable fusion. InTense can separately capture both linear combinations and multiplicative interactions of the data types, thereby disentangling higher-order interactions from the individual effects of each modality. InTense provides interpretability out of the box by assigning relevance scores to modalities and their associations, respectively. The approach is theoretically grounded and yields meaningful relevance scores on multiple synthetic and real-world datasets. Experiments on four real-world datasets show that InTense outperforms existing state-of-the-art multimodal interpretable approaches in terms of accuracy and interpretability.

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Saurabh Varshneya , Antoine Ledent , Philipp Liznerski , Andriy Balinskyy , Purvanshi Mehta , Waleed Mustafa , Marius Kloft

Topics

Artificial Intelligence > Core AI > Multimodal Learning

Keywords

multimodal learning feature fusion relevance score tensor fusion

Download PDF

Related papers

Langshaw: Declarative Interaction Protocols Based on Sayso and Conflict 2024

A Successful Strategy for Multichannel Iterated Prisoner’s Dilemma 2024

Bring Metric Functions into Diffusion Models 2024

Fast One-Stage Unsupervised Domain Adaptive Person Search 2024

FreqFormer: Frequency-aware Transformer for Lightweight Image Super-resolution 2024