ACL 2023
HaVQA: A Dataset for Visual Question Answering and Multimodal Research in Hausa Language
Abstract
This paper presents “HaVQA”, the first multimodal dataset for visual question answering (VQA) tasks in the Hausa language. The dataset was created by manually translating 6,022 English question-answer pairs, which are associated with 1,555 unique images from the Visual Genome dataset. As a result, the dataset provides 12,044 gold standard English-Hausa parallel sentences that were translated in a fashion that guarantees their semantic match with the corresponding visual information. We conducted several baseline experiments on the dataset, including visual question answering, visual question elicitation, text-only and multimodal machine translation.
Topics
Artificial Intelligence > Core AI > Multimodal Learning
Natural Language Processing > Applications > Question Answering
Natural Language Processing > Resources & Methods > Multilingual NLP
Machine Learning > Learning Types > Multi-Modal Learning
Natural Language Processing > Applications > Visual Question Answering
Deep Learning > Learning Types > Multi-Modal Learning
Computer Vision > Generation > Visual Question Answering