SKIM at WMT 2023 General Translation Task

Keito Kudo; Takumi Ito; Makoto Morishita; Jun Suzuki

EMNLP 2023

Abstract

The SKIM team’s submission used a standard procedure to build ensemble Transformer models: training base models, back-translating with the base models for data augmentation, and retraining several final models on the back-translated training data. Each final model had its own architecture and configuration, with up to 10.5B parameters, and replaced the self-attention and cross-attention sub-layers in the decoder with a cross+self-attention sub-layer. For each sentence, we selected the best candidate from a large pool of 70 translations generated by 13 distinct models, using MBR reranking with COMET and COMET-QE. We also applied data augmentation and selection techniques to the training data of the Transformer models.
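The candidate-selection step can be illustrated with a minimal sketch of generic MBR (minimum Bayes risk) reranking: each candidate is scored by its average utility against all other candidates in the pool, and the highest-scoring one is kept. The abstract does not describe how COMET and COMET-QE were combined, so this sketch does not attempt to reproduce the submission's method. The function names `token_f1` and `mbr_select` are hypothetical, and the token-overlap utility is an assumed stand-in for a learned metric such as COMET.

```python
from collections import Counter
from typing import Callable, Sequence


def token_f1(hypothesis: str, reference: str) -> float:
    """Toy utility: token-level F1 overlap.

    Assumption: this stands in for a learned metric such as COMET,
    which the submission used but which is not reproduced here.
    """
    hyp_tokens = hypothesis.split()
    ref_tokens = reference.split()
    if not hyp_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(hyp_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def mbr_select(
    candidates: Sequence[str],
    utility: Callable[[str, str], float] = token_f1,
) -> int:
    """Return the index of the candidate with the highest average
    utility when every other candidate is treated as a pseudo-reference."""
    if not candidates:
        raise ValueError("candidates must not be empty")
    if len(candidates) == 1:
        return 0
    best_index, best_score = 0, float("-inf")
    for i, hypothesis in enumerate(candidates):
        total = sum(
            utility(hypothesis, pseudo_ref)
            for j, pseudo_ref in enumerate(candidates)
            if j != i
        )
        score = total / (len(candidates) - 1)
        if score > best_score:
            best_index, best_score = i, score
    return best_index


if __name__ == "__main__":
    pool = [
        "the cat sat on the mat",
        "the cat sat on a mat",
        "a dog ran away",
    ]
    print(pool[mbr_select(pool)])
```

In this toy pool, the selected candidate is the one that agrees most with the rest of the pool on average. With a real pool of 70 translations, the pairwise scoring costs on the order of 70 × 69 utility calls per sentence, which is why the choice of utility metric dominates the runtime.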

Topics

Machine Learning > Application Areas > Data Augmentation
Natural Language Processing > Applications > Machine Translation

Keywords

machine translation, data augmentation, model ensemble, MBR reranking

Related papers

Exploring Linguistic Probes for Morphological Generalization 2023

NameGuess: Column Name Expansion for Tabular Data 2023

Vision-Enhanced Semantic Entity Recognition in Document Images via Visually-Asymmetric Consistency Learning 2023

Improving Conversational Recommendation Systems via Bias Analysis and Language-Model-Enhanced Data Augmentation 2023

On the Calibration of Large Language Models and Alignment 2023