CLUSE: Cross-Lingual Unsupervised Sense Embeddings

Ta-Chung Chi; Yun-Nung Chen

EMNLP 2018

Abstract

This paper proposes a modularized sense induction and representation learning model that jointly learns bilingual sense embeddings that align well in the vector space. The model exploits the cross-lingual signal in an English-Chinese parallel corpus to capture the collocation and distributed characteristics of the language pair. It is evaluated on the Stanford Contextual Word Similarity (SCWS) dataset to verify the quality of the monolingual sense embeddings. In addition, we introduce Bilingual Contextual Word Similarity (BCWS), a large, high-quality dataset for evaluating cross-lingual sense embeddings, which is the first attempt to measure whether the learned embeddings are indeed well aligned in the vector space. The proposed approach yields higher-quality sense embeddings in both the monolingual and the bilingual evaluations.
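To make the evaluation setup concrete, the following is a minimal sketch of how a contextual word similarity benchmark such as SCWS or BCWS is commonly scored. It assumes that a sense vector has already been selected for each word in its context, that model similarity is the cosine between the two vectors, and that agreement with human ratings is measured by Spearman's rank correlation. The abstract does not state the metric, so the choice of Spearman correlation is an assumption, and the vectors and ratings below are invented toy values, not data from either dataset.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = ranks(xs), ranks(ys)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (std_x * std_y)


# Hypothetical toy items: (sense vector of word 1, sense vector of word 2,
# human similarity rating). In a bilingual setting the two vectors would come
# from different languages but live in the same aligned space.
items = [
    ([1.0, 0.0], [0.9, 0.1], 9.0),
    ([1.0, 0.0], [0.5, 0.5], 6.0),
    ([1.0, 0.0], [0.0, 1.0], 1.0),
    ([1.0, 0.0], [-1.0, 0.0], 0.5),
]

model_scores = [cosine(u, v) for u, v, _ in items]
human_scores = [rating for _, _, rating in items]
score = spearman(model_scores, human_scores)
print(f"Spearman correlation: {score:.3f}")
```

In this toy example the model similarities and the human ratings are ordered identically, so the correlation is close to its maximum; with real embeddings the score reflects how often the model ranks word pairs the way annotators do.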

Topics

Machine Learning > Core Methods > Representation Learning
Machine Learning > Core Methods > Embedding Learning
Machine Learning > Learning Types > Unsupervised Learning
Natural Language Processing > Resources & Methods > Text Representation
Interdisciplinary > Linguistics > Semantics
Deep Learning > Learning Types > Representation Learning
Deep Learning > Learning Types > Unsupervised Learning
Artificial Intelligence > Core AI > Natural Language Processing

Keywords

unsupervised learning, representation learning, word sense disambiguation, cross-lingual embedding, word embedding, word similarity, sense embedding, bilingual corpus, cross-lingual semantics, vector space alignment, sense induction, bilingual sense
