Are References Really Needed? Unbabel-IST 2021 Submission for the Metrics Shared Task

Ricardo Rei; Ana C Farinha; Chrysoula Zerva; Daan van Stigt; Craig Stewart; Pedro Ramos; Taisiya Glushkova; André F. T. Martins; Alon Lavie

EMNLP 2021

Abstract

In this paper, we present the joint contribution of Unbabel and IST to the WMT 2021 Metrics Shared Task. With this year’s focus on Multidimensional Quality Metrics (MQM) as the ground-truth human assessment, our aim was to steer COMET towards higher correlations with MQM. We do so by first pre-training on Direct Assessments and then fine-tuning on z-normalized MQM scores. Our experiments also show that reference-free COMET models are becoming competitive with reference-based models, even outperforming the best COMET model from 2020 on this year’s development data. Additionally, we present COMETinho, a lightweight COMET model that is 19x faster on CPU than the original model while also achieving state-of-the-art correlations with MQM. Finally, in the “QE as a metric” track, we participated with a QE model trained using the OpenKiwi framework, leveraging MQM scores and word-level annotations.
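As an illustration of the z-normalization step mentioned above, the following is a minimal sketch and not the authors' implementation. It assumes that raw MQM scores are grouped per annotator and standardized within each group; the abstract does not state the grouping, and the function name `z_normalize` and the input layout are hypothetical.

```python
from statistics import mean, pstdev


def z_normalize(scores_by_annotator):
    """Standardize each annotator's scores to z-scores: (x - mean) / std.

    Assumption: scores are normalized per annotator. The input is a dict
    mapping an annotator id to a list of raw scores (hypothetical layout).
    """
    normalized = {}
    for annotator, scores in scores_by_annotator.items():
        mu = mean(scores)
        sigma = pstdev(scores)  # population standard deviation
        # An annotator who always gives the same score has zero spread;
        # map those scores to 0.0 instead of dividing by zero.
        normalized[annotator] = [
            (s - mu) / sigma if sigma > 0 else 0.0 for s in scores
        ]
    return normalized


if __name__ == "__main__":
    raw = {"annotator_a": [1.0, 2.0, 3.0], "annotator_b": [5.0, 5.0]}
    print(z_normalize(raw))
```

After this transformation each annotator's scores have zero mean and, unless constant, unit variance, so annotators who use the scale differently contribute comparable regression targets.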

Topics

Deep Learning > Architectures > Transformers
Natural Language Processing > Applications > Machine Translation
Machine Learning > Learning Types > Supervised Learning
Machine Learning > Learning Types > Evaluation
Deep Learning > Models > Transformers
Natural Language Processing > Applications > Quality Estimation

Keywords

transfer learning; machine translation; quality estimation; language model; reference-free evaluation; quality evaluation; neural network; direct assessment; pre-training; fine-tuning
