Research Explorer
Visual Question Answering
219 directly classified papers
Papers per year
2016: 1
2017: 6
2018: 13
2019: 26
2020: 22
2021: 23
2022: 20
2023: 20
2024: 37
2025: 49
2026: 2
Papers
E-Logic Prompt: Unified Energy-Logic Framework for Continual Visual Question Answering
AAAI 2026
Landsat30-AU: A Vision-Language Dataset for Australian Landsat Imagery
AAAI 2026
Express What You See: Can Multimodal LLMs Decode Visual Ciphers with Intuitive Semiosis Comprehension?
ACL 2025
Visual Question Answering for Peruvian Cuisine in Regional Spanish
AAAI 2025
Acknowledging Focus Ambiguity in Visual Questions
ICCV 2025
TaiwanVQA: A Benchmark for Visual Question Answering for Taiwanese Daily Life
COLING 2025
Alleviating Textual Reliance in Medical Language-guided Segmentation via Prototype-driven Semantic Approximation
ICCV 2025
OVQA: A Dataset for Visual Question Answering and Multimodal Research in Odia Language
COLING 2025
GEMeX: A Large-Scale, Groundable, and Explainable Medical VQA Benchmark for Chest X-ray Diagnosis
ICCV 2025
Multimodal Commonsense Knowledge Distillation for Visual Question Answering (Student Abstract)
AAAI 2025
Guiding Vision-Language Model Selection for Visual Question-Answering Across Tasks, Domains, and Knowledge Types
COLING 2025
IntelliCockpitBench: A Comprehensive Benchmark to Evaluate VLMs for Intelligent Cockpit
ACL 2025
BiMediX2: Bio-Medical EXpert LMM for Diverse Medical Modalities
EMNLP 2025
ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild
COLING 2025
See the World, Discover Knowledge: A Chinese Factuality Evaluation for Large Vision Language Models
ACL 2025
Where is this coming from? Making groundedness count in the evaluation of Document VQA models
NAACL 2025
Visual Robustness Benchmark for Visual Question Answering (VQA)
WACV 2025
Detecting Knowledge Boundary of Vision Large Language Models by Sampling-Based Inference
EMNLP 2025
HalLoc: Token-level Localization of Hallucinations for Vision Language Models
CVPR 2025
MSR2: A Benchmark for Multi-Source Retrieval and Reasoning in Visual Question Answering
NAACL 2025
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines
NAACL 2025
ALLVB: All-in-One Long Video Understanding Benchmark
AAAI 2025
NLKI: A Lightweight Natural Language Knowledge Integration Framework for Improving Small VLMs in Commonsense VQA Tasks
EMNLP 2025
Fine-Grained Spatial and Verbal Losses for 3D Visual Grounding
WACV 2025
VQA4CIR: Boosting Composed Image Retrieval with Visual Question Answering
AAAI 2025