Alignment-Enhanced Transformer for Constraining NMT with Pre-Specified Translations

Kai Song; Kun Wang; Heng Yu; Yue Zhang; Zhongqiang Huang; Weihua Luo; Xiangyu Duan; Min Zhang

AAAI 2020

Abstract

We investigate the task of constraining NMT with pre-specified translations, which has practical significance for a number of research and industrial applications. Existing works impose pre-specified translations as lexical constraints during decoding, which are based on word alignments derived from target-to-source attention weights. However, multiple recent studies have found that word alignment derived from generic attention heads in the Transformer is unreliable. We address this problem by introducing a dedicated head in the multi-head Transformer architecture to capture external supervision signals. Results on five language pairs show that our method is highly effective in constraining NMT with pre-specified translations, consistently outperforming previous methods in translation quality.
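The idea of supervising one dedicated attention head with external alignments can be illustrated with a minimal sketch. The abstract does not state the training objective, so the cross-entropy loss below is an assumption, as are all names (`alignment_loss`, `aligned_source`, `gold_alignment`) and the idea that gold links come from an external word aligner. This is not the authors' implementation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def alignment_loss(head_scores, gold_alignment):
    """Assumed objective: cross-entropy between the dedicated head's
    attention distribution and externally supplied alignment links.

    head_scores[t][s]: raw attention score of target position t on source s.
    gold_alignment[t]: source index aligned to target t, or None if unaligned.
    """
    loss, count = 0.0, 0
    for t, scores in enumerate(head_scores):
        src = gold_alignment[t]
        if src is None:
            continue  # no supervision signal for unaligned target words
        probs = softmax(scores)
        loss -= math.log(probs[src])
        count += 1
    return loss / count if count else 0.0

def aligned_source(head_scores_row):
    """Return the source position the dedicated head attends to most."""
    probs = softmax(head_scores_row)
    return max(range(len(probs)), key=probs.__getitem__)
```

At decoding time, a position returned by `aligned_source` could be checked against the source spans that carry pre-specified translations. How the constraint is then applied is not described in the abstract and is left out of this sketch.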

Topics

Deep Learning > Architectures > Transformers
Natural Language Processing > Applications > Machine Translation
Natural Language Processing > Generation > Machine Translation
Deep Learning > Models > Transformers
Machine Learning > Learning Types > Machine Translation

Keywords

transformer architecture, attention mechanism, neural machine translation, word alignment, multi-head attention, lexical constraint
