A Dataset of Peer Reviews (PeerRead): Collection, Insights and NLP Applications

Dongyeop Kang; Waleed Ammar; Bhavana Dalvi; Madeleine van Zuylen; Sebastian Kohlmeier; Eduard Hovy; Roy Schwartz

NAACL 2018

Abstract

Peer reviewing is a central component in the scientific publishing process. We present the first public dataset of scientific peer reviews available for research purposes (PeerRead v1), providing an opportunity to study this important artifact. The dataset consists of 14.7K paper drafts and the corresponding accept/reject decisions in top-tier venues including ACL, NIPS and ICLR. The dataset also includes 10.7K textual peer reviews written by experts for a subset of the papers. We describe the data collection process and report interesting observed phenomena in the peer reviews. We also propose two novel NLP tasks based on this dataset and provide simple baseline models. In the first task, we show that simple models can predict whether a paper is accepted with up to 21% error reduction compared to the majority baseline. In the second task, we predict the numerical scores of review aspects and show that simple models can outperform the mean baseline for aspects with high variance such as ‘originality’ and ‘impact’.
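The two baselines named in the abstract can be illustrated with a minimal sketch. The function names, the toy labels and scores, and the 0.7 model accuracy below are all hypothetical and chosen only for illustration; none of them come from the PeerRead data or the paper's experiments. The sketch assumes that "error reduction" means the relative reduction in classification error over the majority baseline, and that the mean baseline predicts the training-set mean score for every review and is scored here by root mean squared error.

```python
from collections import Counter
from math import sqrt


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


def relative_error_reduction(baseline_accuracy, model_accuracy):
    """Fraction of the baseline's error that the model removes."""
    baseline_error = 1.0 - baseline_accuracy
    model_error = 1.0 - model_accuracy
    return (baseline_error - model_error) / baseline_error


def mean_baseline_rmse(train_scores, test_scores):
    """RMSE of predicting the training-set mean for every test item."""
    mean_score = sum(train_scores) / len(train_scores)
    squared_errors = [(s - mean_score) ** 2 for s in test_scores]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Toy accept/reject labels (hypothetical, not PeerRead data).
labels = ["reject"] * 6 + ["accept"] * 4
baseline_acc = majority_baseline_accuracy(labels)

# Hypothetical model accuracy, used only to show the arithmetic.
model_acc = 0.7
reduction = relative_error_reduction(baseline_acc, model_acc)
print(f"majority baseline accuracy: {baseline_acc:.2f}")
print(f"relative error reduction:   {reduction:.2%}")

# Toy aspect scores on a 1-5 scale (hypothetical).
rmse = mean_baseline_rmse(train_scores=[2, 3, 4], test_scores=[1, 3, 5])
print(f"mean baseline RMSE:         {rmse:.3f}")
```

Under these assumptions, a model beats the mean baseline on an aspect when its error on held-out reviews is lower than the value returned by `mean_baseline_rmse`. That is easier to achieve when the scores vary widely, because a constant prediction then fits poorly.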

Topics

Natural Language Processing > Applications > Text Classification
Natural Language Processing > Resources & Methods > Natural Language Inference
Natural Language Processing > Applications > Sentiment Analysis

Keywords

natural language processing; text classification; scientific writing; peer review; scientific literature; review score prediction; accept/reject prediction

Related papers

A Melody-Conditioned Lyrics Language Model 2018

Before Name-Calling: Dynamics and Triggers of Ad Hominem Fallacies in Web Argumentation 2018

Automated Essay Scoring in the Presence of Biased Ratings 2018

Neural Automated Essay Scoring and Coherence Modeling for Adversarially Crafted Input 2018

QuickEdit: Editing Text & Translations by Crossing Words Out 2018