Research Explorer
Visual Question Answering
219 directly classified papers
Papers per year
2016: 1
2017: 6
2018: 13
2019: 26
2020: 22
2021: 23
2022: 20
2023: 20
2024: 37
2025: 49
2026: 2
Papers
VQA4CIR: Boosting Composed Image Retrieval with Visual Question Answering
AAAI 2025
Plot Twist: Multimodal Models Don’t Comprehend Simple Chart Details
EMNLP 2024
CommVQA: Situating Visual Question Answering in Communicative Contexts
EMNLP 2024
Learning to Localize Objects Improves Spatial Reasoning in Visual-LLMs
CVPR 2024
Detection-Based Intermediate Supervision for Visual Question Answering
AAAI 2024
Large Language Models Know What is Key Visual Entity: An LLM-assisted Multimodal Retrieval for VQA
EMNLP 2024
JDocQA: Japanese Document Question Answering Dataset for Generative Language Models
COLING 2024
VQAttack: Transferable Adversarial Attacks on Visual Question Answering via Pre-trained Models
AAAI 2024
CPSeg: Finer-Grained Image Semantic Segmentation via Chain-of-Thought Language Prompting
WACV 2024
Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation
CVPR 2024
Benchmarking Out-of-Distribution Detection in Visual Question Answering
WACV 2024
Diversify, Rationalize, and Combine: Ensembling Multiple QA Strategies for Zero-shot Knowledge-based VQA
EMNLP 2024
Translation Deserves Better: Analyzing Translation Artifacts in Cross-lingual Visual Question Answering
ACL 2024
Towards Addressing the Misalignment of Object Proposal Evaluation for Vision-Language Tasks via Semantic Grounding
WACV 2024
MIVC: Multiple Instance Visual Component for Visual-Language Models
WACV 2024
Object-Aware Adaptive-Positivity Learning for Audio-Visual Question Answering
AAAI 2024
Rethinking Two-Stage Referring Expression Comprehension: A Novel Grounding and Segmentation Method Modulated by Point
AAAI 2024
How to Configure Good In-Context Sequence for Visual Question Answering
CVPR 2024
Visual Program Distillation: Distilling Tools and Programmatic Reasoning into Vision-Language Models
CVPR 2024
OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM
CVPR 2024
MemeMQA: Multimodal Question Answering for Memes via Rationale-Based Inferencing
ACL 2024
Muffin or Chihuahua? Challenging Multimodal Large Language Models with Multipanel VQA
ACL 2024
Causal-CoG: A Causal-Effect Look at Context Generation for Boosting Multi-modal Language Models
CVPR 2024
AACP: Aesthetics Assessment of Children’s Paintings Based on Self-Supervised Learning
AAAI 2024
Quilt-LLaVA: Visual Instruction Tuning by Extracting Localized Narratives from Open-Source Histopathology Videos
CVPR 2024