Towards Document-Level Human MT Evaluation: On the Issues of Annotator Agreement, Effort and Misevaluation

Sheila Castilho

2021 EACL EACL 2021

Towards Document-Level Human MT Evaluation: On the Issues of Annotator Agreement, Effort and Misevaluation

Abstract

AbstractDocument-level human evaluation of machine translation (MT) has been raising interest in the community. However, little is known about the issues of using document-level methodologies to assess MT quality. In this article, we compare the inter-annotator agreement (IAA) scores, the effort to assess the quality in different document-level methodologies, and the issue of misevaluation when sentences are evaluated out of context.

🌉 Interdisciplinary Bridge — Machine Learning and Natural Language Processing

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Sheila Castilho

Topics

Machine Learning > Optimization & Theory > Statistical Learning Natural Language Processing > Applications > Machine Translation

Keywords

machine translation human evaluation translation quality inter-annotator agreement document-level evaluation

Download PDF

Related papers

Joint Coreference Resolution and Character Linking for Multiparty Conversation 2021

Progressively Pretrained Dense Corpus Index for Open-Domain Question Answering 2021

Crisscrossed Captions: Extended Intramodal and Intermodal Semantic Similarity Judgments for MS-COCO 2021

Representations for Question Answering from Documents with Tables and Text 2021

Gender and Racial Fairness in Depression Research using Social Media 2021