A Systematic Classification of Knowledge, Reasoning, and Context within the ARC Dataset

Michael Boratko; Harshit Padigela; Divyendra Mikkilineni; Pritish Yuvraj; Rajarshi Das; Andrew McCallum; Maria Chang; Achille Fokoue-Nkoutche; Pavan Kapanipathi; Nicholas Mattei; Ryan Musa; Kartik Talamadupula; Michael Witbrock

ACL 2018

Abstract

The recent work of Clark et al. (2018) introduces the AI2 Reasoning Challenge (ARC) and the associated ARC dataset that partitions open domain, complex science questions into easy and challenge sets. That paper includes an analysis of 100 questions with respect to the types of knowledge and reasoning required to answer them; however, it does not include clear definitions of these types, nor does it offer information about the quality of the labels. We propose a comprehensive set of definitions of knowledge and reasoning types necessary for answering the questions in the ARC dataset. Using ten annotators and a sophisticated annotation interface, we analyze the distribution of labels across the challenge set and statistics related to them. Additionally, we demonstrate that although naive information retrieval methods return sentences that are irrelevant to answering the query, sufficient supporting text is often present in the ARC corpus. Evaluating with human-selected relevant sentences improves the performance of a neural machine comprehension model by 42 points.

Topics

Natural Language Processing > Understanding > Semantic Analysis
Natural Language Processing > Applications > Machine Reading Comprehension
Natural Language Processing > Applications > Question Answering
Knowledge & Reasoning > Representation > Knowledge Representation
Knowledge & Reasoning > Reasoning > Automated Reasoning
Artificial Intelligence > Core AI > Reasoning
Artificial Intelligence > Core AI > Knowledge

Keywords

question answering; information retrieval; machine reading comprehension; automated reasoning; dataset annotation; science question; knowledge classification; neural comprehension model
