Multimodal Logical Inference System for Visual-Textual Entailment

Riko Suzuki; Hitomi Yanaka; Masashi Yoshikawa; Koji Mineshima; Daisuke Bekki

ACL 2019

Abstract

A large body of recent research on multimodal inference across text and vision aims to obtain visually grounded word and sentence representations. In this paper, we use logic-based representations as unified meaning representations for texts and images and present an unsupervised multimodal logical inference system that can effectively prove entailment relations between them. We show that by combining semantic parsing and theorem proving, the system can handle semantically complex sentences for visual-textual inference.
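As a rough illustration of using one logic-based representation for both an image and a sentence, the Python sketch below encodes a scene as a finite first-order structure and checks whether a hand-written formula holds in it. It is not the system described in the paper: the formulas are written by hand rather than produced by a semantic parser, and a simple model checker under a closed-world assumption stands in for the theorem prover. The names Scene and holds, the tuple encoding of formulas, and the example scene are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Scene:
    """Hypothetical image representation: a finite first-order structure."""
    entities: set
    unary: dict = field(default_factory=dict)   # predicate -> set of entities
    binary: dict = field(default_factory=dict)  # relation -> set of (e1, e2)


def holds(formula, scene, env=None):
    """Evaluate a formula (nested tuples) in the scene.

    Closed-world assumption: anything not listed in the scene is false.
    """
    env = env or {}
    op = formula[0]
    if op == "pred":                      # ("pred", name, var)
        _, name, var = formula
        return env[var] in scene.unary.get(name, set())
    if op == "rel":                       # ("rel", name, var1, var2)
        _, name, v1, v2 = formula
        return (env[v1], env[v2]) in scene.binary.get(name, set())
    if op == "not":
        return not holds(formula[1], scene, env)
    if op == "and":
        return all(holds(f, scene, env) for f in formula[1:])
    if op == "implies":
        return (not holds(formula[1], scene, env)) or holds(formula[2], scene, env)
    if op == "exists":                    # ("exists", var, body)
        _, var, body = formula
        return any(holds(body, scene, {**env, var: e}) for e in scene.entities)
    if op == "forall":                    # ("forall", var, body)
        _, var, body = formula
        return all(holds(body, scene, {**env, var: e}) for e in scene.entities)
    raise ValueError(f"unknown operator: {op}")


# Made-up scene: one black cat touching one of two dogs.
scene = Scene(
    entities={"e1", "e2", "e3"},
    unary={"cat": {"e1"}, "dog": {"e2", "e3"}, "black": {"e1"}},
    binary={"touch": {("e1", "e2")}},
)

# "A cat touches a dog."
a_cat_touches_a_dog = (
    "exists", "x", ("exists", "y",
        ("and", ("pred", "cat", "x"), ("pred", "dog", "y"),
                ("rel", "touch", "x", "y"))))

# "No cat is white."
no_cat_is_white = (
    "not", ("exists", "x",
        ("and", ("pred", "cat", "x"), ("pred", "white", "x"))))

# "Every dog is touched by a cat."
every_dog_touched = (
    "forall", "y", ("implies", ("pred", "dog", "y"),
        ("exists", "x",
            ("and", ("pred", "cat", "x"), ("rel", "touch", "x", "y")))))

for label, f in [("a cat touches a dog", a_cat_touches_a_dog),
                 ("no cat is white", no_cat_is_white),
                 ("every dog is touched by a cat", every_dog_touched)]:
    print(label, "->", holds(f, scene))
```

The negated and universally quantified examples hint at why a logical representation helps with semantically complex sentences: negation and quantifier scope are handled by the formula structure itself, not by surface word overlap.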

Authors

Riko Suzuki, Hitomi Yanaka, Masashi Yoshikawa, Koji Mineshima, Daisuke Bekki

Topics

Natural Language Processing > Resources & Methods > Natural Language Inference
Knowledge & Reasoning > Reasoning > Automated Reasoning
Natural Language Processing > Applications > Natural Language Inference
Deep Learning > Learning Types > Multi-Modal Learning

Keywords

theorem proving, semantic parsing, multimodal learning, visual reasoning, meaning representation, multimodal inference, logical inference, visual-textual entailment, multimodal logical inference
