MAAM: A Morphology-Aware Alignment Model for Unsupervised Bilingual Lexicon Induction

Pengcheng Yang; Fuli Luo; Peng Chen; Tianyu Liu; Xu Sun

ACL 2019

Abstract

The task of unsupervised bilingual lexicon induction (UBLI) aims to induce word translations from monolingual corpora in two languages. Previous work has shown that morphological variation is an intractable challenge for the UBLI task, where the induced translation in failure cases is usually morphologically related to the correct translation. To tackle this challenge, we propose a morphology-aware alignment model for the UBLI task. The proposed model aims to alleviate the adverse effect of morphological variation by introducing grammatical information learned by the pre-trained denoising language model. Results show that our approach can substantially outperform several state-of-the-art unsupervised systems and even achieve competitive performance compared to supervised methods.

Topics

Artificial Intelligence > Core AI > Causal Inference

Machine Learning > Learning Types > Unsupervised Learning

Machine Learning > Learning Types > Representation Learning

Deep Learning > Learning Types > Unsupervised Learning

Natural Language Processing > Applications > Natural Language Understanding

Keywords

unsupervised learning, word alignment, cross-lingual alignment, bilingual lexicon induction, morphological variation, cross-lingual induction, denoising language model

Related papers

What do phone embeddings learn about Phonology? 2019

Unsupervised Morphological Segmentation for Low-Resource Polysynthetic Languages 2019

Understanding Undesirable Word Embedding Associations 2019

Inferential Machine Comprehension: Answering Questions by Recursively Deducing the Evidence Chain from Text 2019

Domain Adaptation of Neural Machine Translation by Lexicon Induction 2019