Quality Estimation via Backtranslation at the WMT 2022 Quality Estimation Task

Sweta Agrawal; Nikita Mehandru; Niloufar Salehi; Marine Carpuat

2022 EMNLP EMNLP 2022

Quality Estimation via Backtranslation at the WMT 2022 Quality Estimation Task

Abstract

AbstractThis paper describes submission to the WMT 2022 Quality Estimation shared task (Task 1: sentence-level quality prediction). We follow a simple and intuitive approach, which consists of estimating MT quality by automatically back-translating hypotheses into the source language using a multilingual MT system. We then compare the resulting backtranslation with the original source using standard MT evaluation metrics. We find that even the best-performing backtranslation-based scores perform substantially worse than supervised QE systems, including the organizers’ baseline. However, combining backtranslation-based metrics with off-the-shelf QE scorers improves correlation with human judgments, suggesting that they can indeed complement a supervised QE system.

🌉 Interdisciplinary Bridge — Deep Learning and Interdisciplinary and Machine Learning and Natural Language Processing

🐣 Hot Topic Early Bird — translation evaluation

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Security & Privacy, Speech & Audio