Parameter Sharing Methods for Multilingual Self-Attentional Translation Models

Devendra Sachan; Graham Neubig

EMNLP 2018

Abstract

In multilingual neural machine translation, it has been shown that sharing a single translation model between multiple languages can achieve competitive performance, sometimes even leading to performance gains over bilingually trained models. However, these improvements are not uniform; often multilingual parameter sharing results in a decrease in accuracy due to translation models not being able to accommodate different languages in their limited parameter space. In this work, we examine parameter sharing techniques that strike a happy medium between full sharing and individual training, specifically focusing on the self-attentional Transformer model. We find that the full parameter sharing approach leads to increases in BLEU scores mainly when the target languages are from a similar language family. However, even in the case where target languages are from different families where full parameter sharing leads to a noticeable drop in BLEU scores, our proposed methods for partial sharing of parameters can lead to substantial improvements in translation accuracy.
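The spectrum the abstract describes, from full sharing through partial sharing to individual training, can be illustrated with a small sketch. This is not the authors' implementation. The component names (`self_attention`, `encoder_attention`, `feed_forward`), the language codes, and the choice of which component to share are all hypothetical, since the abstract does not say which parameters the proposed methods share. The sketch only shows the mechanism: a shared component is a single parameter object referenced by every target-language decoder.

```python
from dataclasses import dataclass, field

# Hypothetical component names for one decoder layer; the abstract does not
# specify which components the authors share.
COMPONENTS = ("self_attention", "encoder_attention", "feed_forward")


@dataclass
class Params:
    """Stand-in for a weight tensor; only object identity matters here."""
    name: str
    values: list = field(default_factory=lambda: [0.0])


def build_decoders(languages, shared_components):
    """Return {language: {component: Params}}.

    Components listed in shared_components map to one Params object reused
    by every language; all others get a separate Params per language.
    """
    shared = {c: Params(f"shared.{c}") for c in shared_components}
    decoders = {}
    for lang in languages:
        decoders[lang] = {
            c: shared[c] if c in shared else Params(f"{lang}.{c}")
            for c in COMPONENTS
        }
    return decoders


def count_unique(decoders):
    """Count distinct parameter objects across all decoders."""
    return len({id(p) for dec in decoders.values() for p in dec.values()})


languages = ["de", "ja"]  # hypothetical target languages
full = build_decoders(languages, COMPONENTS)               # full sharing
none = build_decoders(languages, ())                       # individual training
partial = build_decoders(languages, ("self_attention",))   # one point in between
```

In a real Transformer the same effect would come from reusing one module instance across decoders, so that gradient updates from every language pair flow into the shared weights while the remaining weights stay language-specific.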

Topics

Deep Learning > Architectures > Transformers
Natural Language Processing > Applications > Machine Translation
Deep Learning > Learning Types > Transfer Learning

Keywords

multilingual translation, neural machine translation, parameter sharing, multilingual neural machine translation, language family, transformer model
