TA-Student VQA: Multi-Agents Training by Self-Questioning

Peixi Xiong; Ying Wu

CVPR 2020

Abstract

There are two main challenges in Visual Question Answering (VQA). The first is that each model has its own strengths and shortcomings when applied to different questions; moreover, the "ceiling effect" for specific questions is difficult to overcome with simple consecutive training. The second is that, although state-of-the-art datasets are large in scale, the questions targeted at a single image are limited in format and lack diversity in content. We introduce a self-questioning model with multi-agent training: TA-student VQA. This framework differs from standard VQA algorithms in that it involves question-generating mechanisms and collaborative learning on questions among the question-answering agents. As a result, TA-student VQA overcomes the limited content diversity and format variation of questions and improves the overall performance of multiple question-answering agents. We evaluate our model on VQA-v2, where it outperforms algorithms without such mechanisms. In addition, TA-student VQA achieves greater model capacity, allowing it to answer generated questions in addition to those in the annotated datasets.

Topics

Artificial Intelligence > Core AI > Multi-Agent Systems
Machine Learning > Learning Types > Self-Supervised Learning
Natural Language Processing > Applications > Question Answering
Natural Language Processing > Applications > Visual Question Answering
Deep Learning > Learning Types > Multi-Modal Learning
Artificial Intelligence > Core AI > Dialogue Systems
Computer Vision > Applications > Visual Question Answering
Deep Learning > Learning Types > Multi-Agent Systems

Keywords

visual question answering, self-supervised learning, question generation, collaborative learning, multi-agent system, multi-agent training
