Monotonic Infinite Lookback Attention for Simultaneous Machine Translation

Naveen Arivazhagan; Colin Cherry; Wolfgang Macherey; Chung-Cheng Chiu; Semih Yavuz; Ruoming Pang; Wei Li; Colin Raffel

ACL 2019

Abstract

Simultaneous machine translation begins to translate each source sentence before the source speaker has finished speaking, with applications to live and streaming scenarios. Simultaneous systems must carefully schedule their reading of the source sentence to balance quality against latency. We present the first simultaneous translation system to learn an adaptive schedule jointly with a neural machine translation (NMT) model that attends over all source tokens read thus far. We do so by introducing Monotonic Infinite Lookback (MILk) attention, which maintains both a hard, monotonic attention head to schedule the reading of the source sentence, and a soft attention head that extends from the monotonic head back to the beginning of the source. We show that MILk's adaptive schedule allows it to arrive at latency-quality trade-offs that are more favorable than those of a recently proposed wait-k strategy for many latency values.
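The two-head mechanism described in the abstract can be illustrated with a minimal inference-time sketch. This is not the authors' implementation: the function name `milk_attention_step`, the use of precomputed energy lists, the 0.5 stopping threshold, and the choice to stop at the last source position when the head never halts are all assumptions made for illustration. Training, which the abstract does not describe, is not covered.

```python
import math


def milk_attention_step(energies_mono, energies_soft, prev_head):
    """One decoder step of a simplified, inference-time MILk-style attention.

    energies_mono[j]: hypothetical monotonic (stop/continue) energy for source position j
    energies_soft[j]: hypothetical soft-attention energy for source position j
    prev_head: source index where the monotonic head stopped at the previous step

    Returns (head, weights), where weights is a distribution over positions 0..head.
    """
    head = prev_head
    last = len(energies_mono) - 1

    # Hard, monotonic head: scan forward from the previous position and stop
    # at the first position whose stop probability reaches 0.5 (assumed threshold).
    # Simplification: if no position qualifies, stop at the final source token.
    while head < last:
        p_stop = 1.0 / (1.0 + math.exp(-energies_mono[head]))
        if p_stop >= 0.5:
            break
        head += 1  # equivalent to reading one more source token

    # Soft head with "infinite lookback": a softmax over every source
    # position from the start of the sentence up to the monotonic head.
    prefix = energies_soft[: head + 1]
    peak = max(prefix)  # subtract the max for numerical stability
    exps = [math.exp(e - peak) for e in prefix]
    total = sum(exps)
    weights = [e / total for e in exps]
    return head, weights
```

Calling the function once per target token, and passing the returned `head` back in as `prev_head`, yields a reading schedule that can only move forward through the source, while the soft weights always span the whole prefix read so far.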

Topics

Natural Language Processing > Applications > Machine Translation
Deep Learning > Techniques > Attention

Keywords

attention mechanism, neural machine translation, latency optimization, simultaneous machine translation, adaptive schedule, monotonic attention, latency-quality trade-off
