Question Answering

4032 directly classified papers

Papers

MapVerse: A Benchmark for Geospatial Question Answering on Diverse Real-World Maps WACV 2026

Knowing What’s Missing: Assessing Information Sufficiency in Question Answering EACL 2026

Classifying and Addressing the Diversity of Errors in Retrieval-Augmented Generation Systems EACL 2026

Korean Canonical Legal Benchmark: Toward Knowledge-Independent Evaluation of LLMs’ Legal Reasoning Capabilities EACL 2026

DeepSieve: Information Sieving via LLM-as-a-Knowledge-Router EACL 2026

Evaluating Multi-Hop Reasoning in Large Language Models: A Chemistry-Centric Benchmark EACL 2026

Tables Decoded: DELTA for Structure, TARQA for Understanding WACV 2026

Stochastic Parrots or True Virtuosos? Digging Deeper Into the Audio-Video Understanding of AVQA Models EACL 2026

PMWP: A Benchmark for Math Word Problem Solving in Persian EACL 2026

One Language, Three of Its Voices: Evaluating Multilingual LLMs Across Persian, Dari, and Tajiki on Translation and Understanding Tasks EACL 2026

FAST-EQA: Efficient Embodied Question Answering with Global and Local Region Relevancy WACV 2026

The Problem of Ambiguity in Table Question Answering EACL 2026

DF-RAG: Query-Aware Diversity for Retrieval-Augmented Generation EACL 2026

TruthTrap: A Bilingual Benchmark for Evaluating Factually Correct Yet Misleading Information in Question Answering EACL 2026

DRIVINGVQA: A Dataset for Interleaved Visual Chain-of-Thought in Real-World Driving Scenarios EACL 2026

DashboardQA: Benchmarking Multimodal Agents for Question Answering on Interactive Dashboards EACL 2026

MangaVQA and MangaLMM: A Benchmark and Specialized Model for Multimodal Manga Understanding EACL 2026

U-MIRAGE: Benchmarking Chain-of-Thought Reasoning for Urdu Medical QA EACL 2026

Building a Conversational AI Assistant for African Travel Services with LLMs and RAG EACL 2026

Who Judges the Judge? Evaluating LLM-as-a-Judge for French Medical open-ended QA EACL 2026

Cross-Lingual Empirical Evaluation of Large Language Models for Arabic Medical Tasks EACL 2026

Exploring Generative Process Reward Modeling for Semi-Structured Data: A Case Study of Table Question Answering EACL 2026

Comprehensive Comparison of RAG Methods Across Multi-Domain Conversational QA EACL 2026

Hospitality-VQA: Decision-Oriented Informativeness Evaluation for Vision–Language Models EACL 2026

Retrieval Enhancements for RAG: Insights from a Deployed Customer Support Chatbot EACL 2026