Visual Question Answering
107 directly classified papers
Papers per year
2016: 2
2017: 5
2018: 8
2019: 12
2020: 14
2021: 7
2022: 5
2023: 12
2024: 20
2025: 22
Papers
KRETA: A Benchmark for Korean Reading and Reasoning in Text-Rich VQA Attuned to Diverse Visual Contexts
EMNLP 2025
Teaching Vision-Language Models to Ask: Resolving Ambiguity in Visual Questions
ACL 2025
Taxonomy-Aware Evaluation of Vision-Language Models
CVPR 2025
WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation
EMNLP 2025
Howard University-AI4PC at SemEval-2025 Task 1: Using GPT-4o and CLIP-ViLT to Decode Figurative Language Across Text and Images
ACL 2025
Can MLLMs Understand the Deep Implication Behind Chinese Images?
ACL 2025
Visual Question Answering on Scientific Charts Using Fine-Tuned Vision-Language Models
ACL 2025
Embodied Scene Understanding for Vision Language Models via MetaVQA
CVPR 2025
Seeing Culture: A Benchmark for Visual Reasoning and Grounding
EMNLP 2025
Memory-QA: Answering Recall Questions Based on Multimodal Memories
EMNLP 2025
Coling-UniA at SciVQA 2025: Few-Shot Example Retrieval and Confidence-Informed Ensembling for Multimodal Large Language Models
ACL 2025
Instruction-tuned QwenChart for Chart Question Answering
ACL 2025
ActiView: Evaluating Active Perception Ability for Multimodal Large Language Models
ACL 2025
One More Modality: Does Abstract Meaning Representation Benefit Visual Question Answering?
EMNLP 2025
MemeQA: Holistic Evaluation for Meme Understanding
ACL 2025
SciVQA 2025: Overview of the First Scientific Visual Question Answering Shared Task
ACL 2025
Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering
CVPR 2025
When Open-Vocabulary Visual Question Answering Meets Causal Adapter: Benchmark and Approach
AAAI 2025
Language Repository for Long Video Understanding
ACL 2025
VisTRA: Visual Tool-use Reasoning Analyzer for Small Object Visual Question Answering
ACL 2025
Challenging Multimodal LLMs with African Standardized Exams: A Document VQA Evaluation
ACL 2025
MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering
ACL 2025
Improving Automatic VQA Evaluation Using Large Language Models
AAAI 2024
Towards More Faithful Natural Language Explanation Using Multi-Level Contrastive Learning in VQA
AAAI 2024
DIEM: Decomposition-Integration Enhancing Multimodal Insights
CVPR 2024