Visual Question Answering
107 directly classified papers
Papers per year
2016: 2
2017: 5
2018: 8
2019: 12
2020: 14
2021: 7
2022: 5
2023: 12
2024: 20
2025: 22
Papers
Improving Selective Visual Question Answering by Learning From Your Peers
CVPR 2023
Super-CLEVR: A Virtual Benchmark To Diagnose Domain Robustness in Visual Reasoning
CVPR 2023
Logical Implications for Visual Question Answering Consistency
CVPR 2023
Compressing and Debiasing Vision-Language Pre-Trained Models for Visual Question Answering
EMNLP 2023
Entity-Focused Dense Passage Retrieval for Outside-Knowledge Visual Question Answering
EMNLP 2022
Dynamic Key-Value Memory Enhanced Multi-Step Graph Reasoning for Knowledge-Based Visual Question Answering
AAAI 2022
Dual-Key Multimodal Backdoors for Visual Question Answering
CVPR 2022
ChartQA: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning
ACL 2022
Mutual Information Divergence: A Unified Metric for Multimodal Generative Models
NeurIPS 2022
Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps
AAAI 2021
CrossVQA: Scalably Generating Benchmarks for Systematically Testing VQA Generalization
EMNLP 2021
How Transferable Are Reasoning Patterns in VQA?
CVPR 2021
VQA-MHUG: A Gaze Dataset to Study Multimodal Neural Attention in Visual Question Answering
EMNLP 2021
Diversity and Consistency: Exploring Visual Question-Answer Pair Generation
EMNLP 2021
TAP: Text-Aware Pre-Training for Text-VQA and Text-Caption
CVPR 2021
Supervising the Transfer of Reasoning Patterns in VQA
NeurIPS 2021
MMFT-BERT: Multimodal Fusion Transformer with BERT Encodings for Visual Question Answering
EMNLP 2020
TA-Student VQA: Multi-Agents Training by Self-Questioning
CVPR 2020
In Defense of Grid Features for Visual Question Answering
CVPR 2020
SQuINTing at VQA Models: Introspecting VQA Models With Sub-Questions
CVPR 2020
Hierarchical Conditional Relation Networks for Video Question Answering
CVPR 2020
Explanation vs Attention: A Two-Player Game to Obtain Attention for VQA
AAAI 2020
Visual Dialogue State Tracking for Question Generation
AAAI 2020
Does my multimodal model learn cross-modal interactions? It’s harder to tell than you might think!
EMNLP 2020
BiST: Bi-directional Spatio-Temporal Reasoning for Video-Grounded Dialogues
EMNLP 2020