ACL 2019
A Corpus for Reasoning about Natural Language Grounded in Photographs
Abstract
We introduce a new dataset for joint reasoning about natural language and images, with a focus on semantic diversity, compositionality, and visual reasoning challenges. The data contains 107,292 examples of English sentences paired with web photographs. The task is to determine whether a natural language caption is true about a pair of photographs. We crowdsource the data using sets of visually rich images and a compare-and-contrast task to elicit linguistically diverse language. Qualitative analysis shows the data requires compositional joint reasoning, including about quantities, comparisons, and relations. Evaluation using state-of-the-art visual reasoning methods shows the data presents a strong challenge.
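The task described above is a binary decision over a sentence and a pair of photographs. A minimal sketch of how one example and an accuracy score could be represented follows. The `Example` fields, the file names, the toy sentences, and the `always_true` baseline are all hypothetical and chosen for illustration; they are not the released data format or any method evaluated in the paper.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Example:
    """One hypothetical example: a sentence, two photographs, a truth label."""
    sentence: str
    left_image: str   # path or identifier of the first photograph (assumed)
    right_image: str  # path or identifier of the second photograph (assumed)
    label: bool       # True if the sentence is true of the image pair


def accuracy(predict: Callable[[Example], bool],
             examples: Sequence[Example]) -> float:
    """Fraction of examples whose predicted truth value matches the label."""
    if not examples:
        raise ValueError("accuracy is undefined for an empty example set")
    correct = sum(predict(ex) == ex.label for ex in examples)
    return correct / len(examples)


def always_true(example: Example) -> bool:
    """Trivial baseline that ignores its input and always answers True."""
    return True


# Toy examples invented for illustration only.
toy_examples = [
    Example("Each image contains exactly two dogs.", "a0.png", "a1.png", True),
    Example("The left image shows more bottles than the right.", "b0.png", "b1.png", False),
    Example("One image contains a single bird on a branch.", "c0.png", "c1.png", True),
]

toy_accuracy = accuracy(always_true, toy_examples)
```

A real model would replace `always_true` with a function that actually inspects the sentence and both images; the scoring function stays the same.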
Interdisciplinary Bridge: Artificial Intelligence and Machine Learning and Natural Language Processing
Trend Setter: Machine Reading Comprehension
Keyword Pioneer: joint reasoning
Cross-Pollinator: Artificial Intelligence, Computer Vision, Deep Learning, Machine Learning, Natural Language Processing, Reinforcement Learning, Robotics
Hot Topic Early Bird: compositional reasoning
Topics
Artificial Intelligence > Core AI > Multimodal Learning
Machine Learning > Learning Types > Weakly Supervised Learning
Natural Language Processing > Applications > Machine Reading Comprehension
Natural Language Processing > Applications > Natural Language Inference
Machine Learning > Learning Types > Multi-Modal Learning
Computer Vision > Analysis > Video Understanding
Deep Learning > Learning Types > Multi-Modal Learning
Computer Vision > Analysis > Visual Question Answering