Syntactically Supervised Transformers for Faster Neural Machine Translation

Nader Akoury; Kalpesh Krishna; Mohit Iyyer

2019 ACL ACL 2019

Syntactically Supervised Transformers for Faster Neural Machine Translation

Abstract

AbstractStandard decoders for neural machine translation autoregressively generate a single target token per timestep, which slows inference especially for long outputs. While architectural advances such as the Transformer fully parallelize the decoder computations at training time, inference still proceeds sequentially. Recent developments in non- and semi-autoregressive decoding produce multiple tokens per timestep independently of the others, which improves inference speed but deteriorates translation quality. In this work, we propose the syntactically supervised Transformer (SynST), which first autoregressively predicts a chunked parse tree before generating all of the target tokens in one shot conditioned on the predicted parse. A series of controlled experiments demonstrates that SynST decodes sentences ~5x faster than the baseline autoregressive Transformer while achieving higher BLEU scores than most competing methods on En-De and En-Fr datasets.

🌉 Interdisciplinary Bridge — Deep Learning and Natural Language Processing

🧭 Keyword Pioneer — autoregressive decoding

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Nader Akoury , Kalpesh Krishna , Mohit Iyyer

Topics

Deep Learning > Architectures > Transformers Natural Language Processing > Understanding > Parsing Natural Language Processing > Applications > Machine Translation

Keywords

transformer architecture neural machine translation syntactic parsing inference speed autoregressive decoding syntactic supervision

Download PDF

Related papers

What do phone embeddings learn about Phonology? 2019

Unsupervised Morphological Segmentation for Low-Resource Polysynthetic Languages 2019

Understanding Undesirable Word Embedding Associations 2019

Inferential Machine Comprehension: Answering Questions by Recursively Deducing the Evidence Chain from Text 2019

Domain Adaptation of Neural Machine Translation by Lexicon Induction 2019