Research Explorer
Visual Question Answering
219 directly classified papers
Papers per year
2016: 1
2017: 6
2018: 13
2019: 26
2020: 22
2021: 23
2022: 20
2023: 20
2024: 37
2025: 49
2026: 2
Papers
Visual Programming: Compositional Visual Reasoning Without Training
CVPR 2023
You Can Ground Earlier Than See: An Effective and Efficient Pipeline for Temporal Sentence Grounding in Compressed Videos
CVPR 2023
LoRA: A Logical Reasoning Augmented Dataset for Visual Question Answering
NeurIPS 2023
LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation
NeurIPS 2023
Unified Language Representation for Question Answering over Text, Tables, and Images
ACL 2023
Filling the Image Information Gap for VQA: Prompting Large Language Models to Proactively Ask Questions
EMNLP 2023
Target-Aware Spatio-Temporal Reasoning via Answering Questions in Dynamic Audio-Visual Scenarios
EMNLP 2023
SlideVQA: A Dataset for Document Visual Question Answering on Multiple Images
AAAI 2023
Language Prior Is Not the Only Shortcut: A Benchmark for Shortcut Learning in VQA
EMNLP 2022
What’s Different between Visual Question Answering for Machine “Understanding” Versus for Accessibility?
IJCNLP 2022
All You May Need for VQA are Image Captions
NAACL 2022
VGNMN: Video-grounded Neural Module Networks for Video-Grounded Dialogue Systems
NAACL 2022
Declaration-based Prompt Tuning for Visual Question Answering
IJCAI 2022
Dynamic Key-Value Memory Enhanced Multi-Step Graph Reasoning for Knowledge-Based Visual Question Answering
AAAI 2022
Multi-VQG: Generating Engaging Questions for Multiple Images
EMNLP 2022
On Advances in Text Generation from Images Beyond Captioning: A Case Study in Self-Rationalization
EMNLP 2022
V-Doc: Visual Questions Answers With Documents
CVPR 2022
Query and Attention Augmentation for Knowledge-Based Explainable Reasoning
CVPR 2022
MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-Based Visual Question Answering
CVPR 2022
LaTr: Layout-Aware Transformer for Scene-Text VQA
CVPR 2022
WebQA: Multihop and Multimodal QA
CVPR 2022
Maintaining Reasoning Consistency in Compositional Visual Question Answering
CVPR 2022
Improving Visual Grounding With Visual-Linguistic Verification and Iterative Reasoning
CVPR 2022
Premise-based Multimodal Reasoning: Conditional Inference on Joint Textual and Visual Clues
ACL 2022
Co-VQA : Answering by Interactive Sub Question Sequence
ACL 2022