Textbook Question Answering Under Instructor Guidance With Memory Networks

Juzheng Li; Hang Su; Jun Zhu; Siyu Wang; Bo Zhang

CVPR 2018

Abstract

Textbook Question Answering (TQA) is the task of choosing the most appropriate answers by reading a multi-modal context of abundant essays and images. TQA serves as a favorable test bed for visual and textual reasoning. However, most current methods cannot reason over long contexts and images. To address this issue, we propose a novel approach, Instructor Guidance with Memory Networks (IGMN), which performs the TQA task by finding contradictions between the candidate answers and their corresponding context. We build the Contradiction Entity-Relationship Graph (CERG) to extend passage-level multi-modal contradictions to the essay level. The machine thus acts as an instructor that extracts the essay-level contradictions as the Guidance. Afterwards, we use memory networks to capture the information in the Guidance, and attention mechanisms to jointly reason over the global features of the multi-modal input. Extensive experiments demonstrate that our method outperforms the state of the art on the TQA dataset. The source code is available at https://github.com/freerailway/igmn.
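To make the memory-and-attention step concrete, the following is a minimal sketch of a single attention read over a memory, in the style of a generic end-to-end memory network. It is not the IGMN implementation: the function names (`memory_read`, `score_candidates`), the dot-product similarity, the additive state update, and the toy two-dimensional vectors are all assumptions chosen for illustration, and the memory slots merely stand in for encoded Guidance items.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def memory_read(query, memory):
    """One attention hop: weight each memory slot by its similarity to the query.

    Returns the attention weights and the weighted sum of the slots.
    """
    weights = softmax([dot(query, slot) for slot in memory])
    read = [
        sum(w * slot[i] for w, slot in zip(weights, memory))
        for i in range(len(query))
    ]
    return weights, read


def score_candidates(question, candidates, memory):
    """Score candidate answers against the question state updated by the read.

    The additive update (question + read) is an assumption of this sketch.
    """
    _, read = memory_read(question, memory)
    state = [q + r for q, r in zip(question, read)]
    return softmax([dot(state, c) for c in candidates])


# Hypothetical toy data: two memory slots, one question, two candidate answers.
memory = [[1.0, 0.0], [0.0, 1.0]]
question = [2.0, 0.0]
candidates = [[1.0, 0.0], [0.0, 1.0]]

weights, _ = memory_read(question, memory)
probs = score_candidates(question, candidates, memory)
```

In this toy setup the question aligns with the first memory slot, so that slot receives the larger attention weight and the first candidate receives the higher score.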

Topics

Machine Learning > Application Areas > Domain Adaptation
Natural Language Processing > Applications > Machine Reading Comprehension
Natural Language Processing > Applications > Visual Question Answering
Artificial Intelligence > Core AI > Multi-Modal Learning
Artificial Intelligence > Core AI > Question Answering

Keywords

visual question answering, attention mechanism, multi-modal learning, visual reasoning, memory network, textbook question answering

Related papers

Multi-Shot Pedestrian Re-Identification via Sequential Decision Making 2018

Multi-Cue Correlation Filters for Robust Visual Tracking 2018

Pointwise Convolutional Neural Networks 2018

Learning Attentions: Residual Attentional Siamese Network for High Performance Online Visual Tracking 2018

Image Generation From Scene Graphs 2018