Speeding Up Neural Machine Translation Decoding by Shrinking Run-time Vocabulary

Xing Shi; Kevin Knight

2017 ACL ACL 2017

Speeding Up Neural Machine Translation Decoding by Shrinking Run-time Vocabulary

Abstract

AbstractWe speed up Neural Machine Translation (NMT) decoding by shrinking run-time target vocabulary. We experiment with two shrinking approaches: Locality Sensitive Hashing (LSH) and word alignments. Using the latter method, we get a 2x overall speed-up over a highly-optimized GPU implementation, without hurting BLEU. On certain low-resource language pairs, the same methods improve BLEU by 0.5 points. We also report a negative result for LSH on GPUs, due to relatively large overhead, though it was successful on CPUs. Compared with Locality Sensitive Hashing (LSH), decoding with word alignments is GPU-friendly, orthogonal to existing speedup methods and more robust across language pairs.

🌉 Interdisciplinary Bridge — Machine Learning and Natural Language Processing

🧭 Keyword Pioneer — decoding speedup

🐣 Hot Topic Early Bird — word alignment

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Speech & Audio

Authors

Xing Shi , Kevin Knight

Topics

Machine Learning > Application Areas > Efficient Computing Natural Language Processing > Applications > Machine Translation

Keywords

neural machine translation word alignment decoding speedup vocabulary pruning

Download PDF

Related papers

A* CCG Parsing with a Supertag and Dependency Factored Model 2017

Detecting annotation noise in automatically labelled data 2017

Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) 2017

Annotating tense, mood and voice for English, French and German 2017

Word Embedding for Response-To-Text Assessment of Evidence 2017