2020
EMNLP
Disentangling dialects: a neural approach to Indo-Aryan historical phonology and subgrouping
Abstract
This paper seeks to uncover patterns of sound change across Indo-Aryan languages using an LSTM encoder-decoder architecture. We augment our models with embeddings representing language ID, part of speech, and other features such as word embeddings. We find that a highly augmented model achieves the highest accuracy in predicting held-out forms, and we investigate other properties of interest learned by our models' representations. We outline extensions to this architecture that can better capture variation in Indo-Aryan sound change.
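One way to picture the augmentation described in the abstract is as concatenation: a language-ID vector is appended to each character vector before the sequence reaches the encoder. The sketch below is illustrative only and is not taken from the paper. The embedding sizes, the language labels, the input string, and the helper names `make_table` and `augment` are all assumptions, and random vectors stand in for learned embeddings.

```python
import random

random.seed(0)

# Hypothetical embedding sizes, chosen only for illustration.
CHAR_DIM, LANG_DIM = 4, 2


def make_table(symbols, dim):
    """Random lookup table standing in for a learned embedding matrix."""
    return {s: [random.uniform(-1, 1) for _ in range(dim)] for s in symbols}


# Hypothetical character inventory and language labels.
char_emb = make_table("abdhimnrstu", CHAR_DIM)
lang_emb = make_table(["hindi", "marathi"], LANG_DIM)


def augment(word, lang):
    """Append the language-ID vector to every character vector in the word."""
    return [char_emb[ch] + lang_emb[lang] for ch in word]


# Each time step now carries CHAR_DIM + LANG_DIM features.
inputs = augment("hath", "hindi")
print(len(inputs), len(inputs[0]))
```

In a full model these augmented vectors would be fed to an LSTM encoder, and further features such as part of speech could be appended in the same way.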
Authors
Topics
Machine Learning > Core Methods > Representation Learning
Deep Learning > Architectures > Neural Networks
Natural Language Processing > Resources & Methods > Lexical Semantics
Natural Language Processing > Resources & Methods > Multilingual NLP
Interdisciplinary > Linguistics
Artificial Intelligence > Core AI > Language
Deep Learning > Architectures > Recurrent Neural Networks