Understanding and Improving Morphological Learning in the Neural Machine Translation Decoder

Fahim Dalvi; Nadir Durrani; Hassan Sajjad; Yonatan Belinkov; Stephan Vogel

IJCNLP 2017

Abstract

End-to-end training makes the neural machine translation (NMT) architecture simpler and more elegant than traditional statistical machine translation (SMT). However, little is known about the linguistic patterns of morphology, syntax, and semantics learned during the training of NMT systems and, more importantly, which parts of the architecture are responsible for learning each of these phenomena. In this paper we i) analyze how much morphology an NMT decoder learns, and ii) investigate whether injecting target morphology into the decoder helps it produce better translations. To this end we present three methods: i) simultaneous translation, ii) joint-data learning, and iii) multi-task learning. Our results show that explicit morphological information helps the decoder learn target language morphology and improves the translation quality by 0.2–0.6 BLEU points.

Topics

Natural Language Processing > Applications > Machine Translation
Natural Language Processing > Generation > Machine Translation

Keywords

multi-task learning, neural machine translation, morphological learning, morphology injection, joint-data learning

Related papers

Procedural Text Generation from an Execution Video 2017

DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset 2017

Roles and Success in Wikipedia Talk Pages: Identifying Latent Patterns of Behavior 2017

PubMed 200k RCT: a Dataset for Sequential Sentence Classification in Medical Abstracts 2017

Alibaba at IJCNLP-2017 Task 1: Embedding Grammatical Features into LSTMs for Chinese Grammatical Error Diagnosis Task 2017