2023 EMNLP EMNLP 2023

Challenging the State-of-the-art Machine Translation Metrics from a Linguistic Perspective

Abstract

AbstractWe employ a linguistically motivated challenge set in order to evaluate the state-of-the-art machine translation metrics submitted to the Metrics Shared Task of the 8th Conference for Machine Translation. The challenge set includes about 21,000 items extracted from 155 machine translation systems for three language directions, covering more than 100 linguistically-motivated phenomena organized in 14 categories. The metrics that have the best performance with regard to our linguistically motivated analysis are the Cometoid22-wmt23 (a trained metric based on distillation) for German-English and MetricX-23-c (based on a fine-tuned mT5 encoder-decoder language model) for English-German and English-Russian. Some of the most difficult phenomena are passive voice for German-English, named entities, terminology and measurement units for English-German, and focus particles, adverbial clause and stripping for English-Russian.

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Security & Privacy, Speech & Audio