On the Same Page? Comparing Inter-Annotator Agreement in Sentence and Document Level Human Machine Translation Evaluation

Sheila Castilho

2020 EMNLP EMNLP 2020

On the Same Page? Comparing Inter-Annotator Agreement in Sentence and Document Level Human Machine Translation Evaluation

Abstract

AbstractDocument-level evaluation of machine translation has raised interest in the community especially since responses to the claims of “human parity” (Toral et al., 2018; Läubli et al., 2018) with document-level human evaluations have been published. Yet, little is known about best practices regarding human evaluation of machine translation at the document-level. This paper presents a comparison of the differences in inter-annotator agreement between quality assessments using sentence and document-level set-ups. We report results of the agreement between professional translators for fluency and adequacy scales, error annotation, and pair-wise ranking, along with the effort needed to perform the different tasks. To best of our knowledge, this is the first study of its kind.

❓ The Questioner

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning and Natural Language Processing

🧭 Keyword Pioneer — fluency assessment

🐣 Hot Topic Early Bird — human evaluation

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Sheila Castilho

Topics

Natural Language Processing > Applications > Machine Translation Machine Learning > Optimization & Theory > Evaluation Deep Learning > Learning Types > Machine Translation

Keywords

machine translation document-level translation human evaluation translation quality inter-annotator agreement sentence-level translation fluency assessment sentence-level evaluation document-level evaluation document level sentence level

Download PDF

Related papers

Fast semantic parsing with well-typedness guarantees 2020

Detecting Objectifying Language in Online Professor Reviews 2020

Analogous Process Structure Induction for Sub-event Sequence Prediction 2020

Aspect Sentiment Classification with Aspect-Specific Opinion Spans 2020

Robust and Interpretable Grounding of Spatial References with Relation Networks 2020