SCITAT: A Question Answering Benchmark for Scientific Tables and Text Covering Diverse Reasoning Types

Xuanliang Zhang; Dingzirui Wang; Baoxin Wang; Longxu Dou; Xinyuan Lu; Keyan Xu; Dayong Wu; Qingfu Zhu

ACL 2025

Abstract

Scientific question answering (SQA) is an important task aimed at answering questions based on papers. However, current SQA datasets have limited reasoning types and neglect the relevance between tables and text, creating a significant gap with real scenarios. To address these challenges, we propose a QA benchmark for scientific tables and text with diverse reasoning types (SCITAT). To cover more reasoning types, we summarize various reasoning types from real-world questions. To reason on both tables and text, we require the questions to incorporate tables and text as much as possible. Based on SCITAT, we propose a baseline (CAR), which combines various reasoning methods to address different reasoning types and process tables and text at the same time. CAR brings average improvements of 4.1% over other baselines on SCITAT, validating its effectiveness. Error analysis reveals the challenges of SCITAT, such as complex numerical calculations and domain knowledge.
