EMNLP 2021
Analyzing the Surprising Variability in Word Embedding Stability Across Languages
Abstract
Word embeddings are powerful representations that form the foundation of many natural language processing architectures, both in English and in other languages. To gain further insight into word embeddings, we explore their stability (e.g., overlap between the nearest neighbors of a word in different embedding spaces) in diverse languages. We discuss linguistic properties that are related to stability, drawing out insights about correlations with affixing, language gender systems, and other features. This has implications for embedding use, particularly in research that uses them to study language trends.
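The notion of stability mentioned in the abstract (overlap between a word's nearest neighbors in different embedding spaces) can be illustrated with a minimal sketch. The function names, the use of cosine similarity, the neighborhood size `k`, and the toy vectors below are all assumptions made for illustration. They are not taken from the paper and may differ from its exact metric.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_neighbors(space, word, k):
    # Return the k words closest to `word` in one embedding space
    # (a dict mapping word -> vector), excluding the word itself.
    others = [w for w in space if w != word]
    others.sort(key=lambda w: cosine(space[word], space[w]), reverse=True)
    return set(others[:k])

def stability(space_a, space_b, word, k=10):
    # Hypothetical stability score: the fraction of the k nearest
    # neighbors of `word` that are shared between the two spaces.
    nn_a = nearest_neighbors(space_a, word, k)
    nn_b = nearest_neighbors(space_b, word, k)
    return len(nn_a & nn_b) / k

# Made-up 2-d vectors standing in for two separately trained spaces.
space_a = {"cat": [1.0, 0.0], "dog": [0.9, 0.1], "kitten": [0.8, 0.2],
           "car": [0.0, 1.0], "truck": [0.1, 0.9]}
space_b = {"cat": [1.0, 0.0], "dog": [0.9, 0.1], "kitten": [0.1, 0.9],
           "car": [0.0, 1.0], "truck": [0.8, 0.2]}

print(stability(space_a, space_b, "cat", k=2))  # 0.5: only "dog" is shared
print(stability(space_a, space_a, "cat", k=2))  # 1.0: identical spaces
```

Under this sketch, a score of 1.0 means the word keeps the same neighborhood in both spaces, and a score near 0 means its neighbors change almost entirely.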
Authors
Topics
Machine Learning > Core Methods > Embedding Learning
Natural Language Processing > Resources & Methods > Multilingual NLP
Natural Language Processing > Resources & Methods > Text Representation
Interdisciplinary > Linguistics
Deep Learning > Learning Types > Representation Learning
Artificial Intelligence > Core AI > Natural Language Processing