Automated Paraphrase Lattice Creation for HyTER Machine Translation Evaluation

Marianna Apidianaki; Guillaume Wisniewski; Anne Cocos; Chris Callison-Burch

NAACL 2018

Abstract

We propose a variant of a well-known machine translation (MT) evaluation metric, HyTER (Dreyer and Marcu, 2012), which exploits reference translations enriched with meaning-equivalent expressions. The original HyTER metric relied on hand-crafted paraphrase networks, which restricted its applicability to new data. We test, for the first time, HyTER with automatically built paraphrase lattices. We show that although the metric obtains good results on small, carefully curated data with both manually and automatically selected substitutes, it achieves only medium performance on much larger and noisier datasets, demonstrating the limits of the metric for tuning and evaluation of current MT systems.
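To make the idea of scoring against paraphrase-enriched references concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a HyTER-style score is the smallest word-level edit distance between a hypothesis and any reference encoded by the lattice, normalized here by the length of that reference. The lattice is simplified to a list of slots, each holding alternative phrasings, and is expanded by brute force; the names `edit_distance`, `expand_lattice`, and `hyter_like_score` are hypothetical.

```python
from itertools import product


def edit_distance(a, b):
    """Word-level Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def expand_lattice(slots):
    """Enumerate every reference encoded by a list of paraphrase slots.

    Assumption: the lattice is a simple sequence of slots with
    alternatives. Real lattices are finite-state networks and are
    searched without full enumeration.
    """
    for choice in product(*slots):
        yield " ".join(choice).split()


def hyter_like_score(hypothesis, slots):
    """Lowest normalized edit distance to any reference in the lattice."""
    hyp = hypothesis.split()
    best = None
    for ref in expand_lattice(slots):
        score = edit_distance(hyp, ref) / max(len(ref), 1)
        if best is None or score < best:
            best = score
    return best


# Hypothetical toy lattice for illustration only.
slots = [["the"], ["cat", "feline"], ["sat", "was sitting"], ["on the mat"]]
print(hyter_like_score("the feline sat on the mat", slots))
print(hyter_like_score("the dog sat on the mat", slots))
```

A lower score means the hypothesis lies closer to some acceptable reference. Brute-force expansion grows exponentially with the number of slots, so this form is only suitable for toy inputs.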

Topics

Natural Language Processing > Applications > Machine Translation
Natural Language Processing > Resources & Methods > Text Representation
Natural Language Processing > Generation > Machine Translation

Keywords

machine translation, evaluation metric, machine translation evaluation, reference translation, paraphrase lattice, HyTER metric, meaning equivalence
