Further Investigation into Reference Bias in Monolingual Evaluation of Machine Translation

Qingsong Ma; Yvette Graham; Timothy Baldwin; Qun Liu

EMNLP 2017

Abstract

Monolingual evaluation of Machine Translation (MT) aims to simplify human assessment by requiring assessors to compare the meaning of the MT output with a reference translation, opening up the task to a much larger pool of genuinely qualified evaluators. Monolingual evaluation runs the risk, however, of bias in favour of MT systems that happen to produce translations superficially similar to the reference and, consistent with this intuition, previous investigations have concluded monolingual assessment to be strongly biased in this respect. On re-examination of past analyses, however, we identify a series of potential analytical errors that raise important questions about the reliability of past conclusions. We subsequently carry out further investigation into reference bias through direct human assessment of MT adequacy using quality-controlled crowd-sourcing. Contrary to both intuition and past conclusions, results show no significant evidence of reference bias in monolingual evaluation of MT.

Topics

Natural Language Processing > Applications > Machine Translation

Machine Learning > Learning Types > Evaluation

Natural Language Processing > Applications > Quality Estimation

Keywords

machine translation, evaluation methodology, human evaluation, translation quality, reference bias, monolingual evaluation
