Marian: Cost-effective High-Quality Neural Machine Translation in C++

Marcin Junczys-Dowmunt; Kenneth Heafield; Hieu Hoang; Roman Grundkiewicz; Anthony Aue

ACL 2018

Abstract

This paper describes the submissions of the “Marian” team to the WNMT 2018 shared task. We investigate combinations of teacher-student training, low-precision matrix products, auto-tuning and other methods to optimize the Transformer model on GPU and CPU. By further integrating these methods with the new averaging attention networks, a recently introduced faster Transformer variant, we create a number of high-quality, high-performance models on the GPU and CPU, dominating the Pareto frontier for this shared task.
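The averaging attention network mentioned above replaces decoder self-attention with a cumulative average over the previous target positions, so each decoding step can be computed from a running sum rather than by attending over the whole history. The following is a minimal sketch of that core idea only, not Marian's implementation: the `CumulativeAverage` type and its `step` method are hypothetical names chosen for illustration, and the further transformations that the published layer applies to the average are omitted.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Marian): keeps the running sum of all
// decoder inputs seen so far for one layer, plus the number of positions.
struct CumulativeAverage {
  std::vector<float> sum;
  std::size_t steps = 0;

  explicit CumulativeAverage(std::size_t dim) : sum(dim, 0.0f) {}

  // Consume the input vector for the next target position j and return the
  // average over positions 1..j. The cost is O(dim) per step and does not
  // grow with j. Assumes y.size() equals the dimension given at construction.
  std::vector<float> step(const std::vector<float>& y) {
    ++steps;
    std::vector<float> avg(sum.size());
    for (std::size_t i = 0; i < sum.size(); ++i) {
      sum[i] += y[i];
      avg[i] = sum[i] / static_cast<float>(steps);
    }
    return avg;
  }
};
```

Calling `step` once per generated token yields the average for that position while storing only a single vector of state, which is the property that makes this variant attractive for fast decoding.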

Topics

Artificial Intelligence > Core AI > Model Compression
Machine Learning > Application Areas > Efficient Computing
Natural Language Processing > Applications > Machine Translation
Natural Language Processing > Generation > Machine Translation
Deep Learning > Optimization & Theory > Model Compression
Deep Learning > Techniques > Knowledge Distillation
Deep Learning > Optimization & Theory > Efficient Computing

Keywords

knowledge distillation; neural machine translation; model averaging; transformer optimization; low-precision computation; model optimization; GPU optimization; teacher-student training; transformer model
