Linguistically Motivated Evaluation of the 2023 State-of-the-art Machine Translation: Can ChatGPT Outperform NMT?

Shushen Manakhimova; Eleftherios Avramidis; Vivien Macketanz; Ekaterina Lapshinova-Koltunski; Sergei Bagdasarov; Sebastian Möller

2023 EMNLP EMNLP 2023

Linguistically Motivated Evaluation of the 2023 State-of-the-art Machine Translation: Can ChatGPT Outperform NMT?

Abstract

AbstractThis paper offers a fine-grained analysis of the machine translation outputs in the context of the Shared Task at the 8th Conference of Machine Translation (WMT23). Building on the foundation of previous test suite efforts, our analysis includes Large Language Models and an updated test set featuring new linguistic phenomena. To our knowledge, this is the first fine-grained linguistic analysis for the GPT-4 translation outputs. Our evaluation spans German-English, English-German, and English-Russian language directions. Some of the phenomena with the lowest accuracies for German-English are idioms and resultative predicates. For English-German, these include mediopassive voice, and noun formation(er). As for English-Russian, these included idioms and semantic roles. GPT-4 performs equally or comparably to the best systems in German-English and English-German but falls in the second significance cluster for English-Russian.

❓ The Questioner

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Shushen Manakhimova , Eleftherios Avramidis , Vivien Macketanz , Ekaterina Lapshinova-Koltunski , Sergei Bagdasarov , Sebastian Möller

Topics

Natural Language Processing > Applications > Machine Translation Natural Language Processing > Resources & Methods > Large Language Models

Keywords

machine translation neural machine translation evaluation metric linguistic analysis fine-grained evaluation large language model

Download PDF

Related papers

Exploring Linguistic Probes for Morphological Generalization 2023

NameGuess: Column Name Expansion for Tabular Data 2023

Vision-Enhanced Semantic Entity Recognition in Document Images via Visually-Asymmetric Consistency Learning 2023

Improving Conversational Recommendation Systems via Bias Analysis and Language-Model-Enhanced Data Augmentation 2023

On the Calibration of Large Language Models and Alignment 2023