STL-CQA: Structure-based Transformers with Localization and Encoding for Chart Question Answering

Hrituraj Singh; Sumit Shekhar

2020 EMNLP EMNLP 2020

STL-CQA: Structure-based Transformers with Localization and Encoding for Chart Question Answering

Abstract

AbstractChart Question Answering (CQA) is the task of answering natural language questions about visualisations in the chart image. Recent solutions, inspired by VQA approaches, rely on image-based attention for question/answering while ignoring the inherent chart structure. We propose STL-CQA which improves the question/answering through sequential elements localization, question encoding and then, a structural transformer-based learning approach. We conduct extensive experiments while proposing pre-training tasks, methodology and also an improved dataset with more complex and balanced questions of different types. The proposed methodology shows a significant accuracy improvement compared to the state-of-the-art approaches on various chart Q/A datasets, while outperforming even human baseline on the DVQA Dataset. We also demonstrate interpretability while examining different components in the inference pipeline.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Computer Vision and Deep Learning and Natural Language Processing

🧭 Keyword Pioneer — chart understanding

🐣 Hot Topic Early Bird — image-text alignment

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics

Authors

Hrituraj Singh , Sumit Shekhar

Topics

Artificial Intelligence > Core AI > Multimodal Learning Natural Language Processing > Applications > Question Answering Computer Vision > Core AI > Multimodal Learning Deep Learning > Models > Transformers Artificial Intelligence > Core AI > Multi-Modal Learning Computer Vision > Generation > Visual Question Answering

Keywords

visual question answering structural learning image-text alignment chart understanding chart question answering image attention structural transformer structural chart parsing

Download PDF

Related papers

Fast semantic parsing with well-typedness guarantees 2020

Detecting Objectifying Language in Online Professor Reviews 2020

Analogous Process Structure Induction for Sub-event Sequence Prediction 2020

Aspect Sentiment Classification with Aspect-Specific Opinion Spans 2020

Robust and Interpretable Grounding of Spatial References with Relation Networks 2020