Evaluating Research Novelty Detection: Counterfactual Approaches

Reinald Kim Amplayo; Seung-won Hwang; Min Song

EMNLP 2019

Abstract

In this paper, we explore strategies to evaluate models for the task of research paper novelty detection: given all papers released on a given date, which of them discuss new ideas and influence future research? We find that novelty is not a singular concept and thus inherently lacks ground-truth annotations with cross-annotator agreement, which is a major obstacle to evaluating these models. The test-of-time award is the closest thing to such an annotation, but it can only be given retrospectively and is extremely scarce. We thus propose to compare and evaluate models using counterfactual simulations. First, we ask whether models can differentiate papers at time t from a counterfactual paper from a future time t+d. Second, we ask whether models can predict the test-of-time award at t+d. These are proxies that human annotators can agree on and that are easily augmented by correlated signals, and with them evaluation can be done through four tasks: classification, ranking, correlation, and feature selection. We show that these proxy evaluation methods complement each other in error handling, coverage, interpretability, and scope, and thus together contribute to observing the relative strengths of existing models.
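The first proxy can be illustrated with a minimal sketch. It assumes that a model assigns each paper a scalar novelty score and that a counterfactual paper taken from t+d should score higher than papers from t. The function name, the example scores, and the choice of a pairwise AUC as the measure are illustrative assumptions, not the authors' implementation.

```python
from typing import Sequence


def counterfactual_auc(current_scores: Sequence[float],
                       future_scores: Sequence[float]) -> float:
    """Fraction of (future, current) pairs where the future paper scores higher.

    Ties count as half a win, so the value equals the ROC AUC of the
    novelty scores when counterfactual (future) papers are the positive class.
    """
    if not current_scores or not future_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for f in future_scores:
        for c in current_scores:
            if f > c:
                wins += 1.0
            elif f == c:
                wins += 0.5
    return wins / (len(current_scores) * len(future_scores))


# Hypothetical novelty scores (made-up numbers, not results from the paper).
papers_at_t = [0.10, 0.35, 0.20, 0.50]   # papers released at time t
papers_from_t_plus_d = [0.60, 0.40]      # counterfactual papers from t+d

auc = counterfactual_auc(papers_at_t, papers_from_t_plus_d)
print(f"counterfactual AUC: {auc:.3f}")
```

A value near 1.0 would mean the model consistently ranks the injected future papers above contemporaneous ones, while a value near 0.5 would mean it cannot tell them apart under this assumed setup.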

Authors

Reinald Kim Amplayo, Seung-won Hwang, Min Song

Topics

Machine Learning > Core Methods > Classification
Machine Learning > Core Methods > Metric Learning
Machine Learning > Optimization & Theory > Statistical Learning
Machine Learning > Learning Types > Representation Learning
Machine Learning > Core Methods > Ranking
Machine Learning > Learning Types > Evaluation

Keywords

feature selection, model evaluation, novelty detection, counterfactual reasoning, counterfactual evaluation, research evaluation, research novelty detection
