Multifaceted Challenge Set for Evaluating Machine Translation Performance

Xiaoyu Chen; Daimeng Wei; Zhanglin Wu; Ting Zhu; Hengchao Shang; Zongyao Li; Jiaxin Guo; Ning Xie; Lizhi Lei; Hao Yang; Yanfei Jiang

EMNLP 2023

Abstract

Machine translation evaluation is critical to machine translation research, because evaluation results reflect the effectiveness of training strategies. A fair and efficient evaluation method is therefore necessary. Many researchers have questioned currently available evaluation metrics from various perspectives and proposed suggestions accordingly. However, to our knowledge, few researchers have analyzed the difficulty level of source sentences and its influence on evaluation results. This paper presents HW-TSC’s submission to the WMT23 MT Test Suites shared task. We propose a systematic approach for constructing challenge sets from four aspects: word difficulty, length difficulty, grammar difficulty, and model learning difficulty. We open-source two Multifaceted Challenge Sets for Zh→En and En→Zh. We also present the results of participants in this year’s General MT shared task on our test sets.

Topics

Natural Language Processing > Applications > Machine Translation

Keywords

neural machine translation, evaluation metric, translation quality, automatic evaluation, test suite, machine translation evaluation, challenge set
