EMNLP 2020
Predicting Typological Features in WALS using Language Embeddings and Conditional Probabilities: ÚFAL Submission to the SIGTYP 2020 Shared Task
Abstract
We present our submission to the SIGTYP 2020 Shared Task on the prediction of typological features. We submit a constrained system that predicts typological features based only on the WALS database. We investigate two approaches. The simpler of the two estimates the correlation of feature values within languages by computing conditional probabilities and mutual information. The second trains a neural predictor that operates on precomputed language embeddings based on WALS features. Our submitted system combines the two approaches based on their self-estimated confidence scores. We reach an accuracy of 70.7% on the test data and rank first in the shared task.
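The conditional-probability idea and the confidence-based combination can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy language table, the feature names, the helper names `conditional_counts`, `predict`, and `combine`, and the choice to trust the single most confident condition. The abstract does not say how the probabilities are aggregated or how confidence is estimated, and the mutual-information component is left out.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: language -> {feature: value}. Not real WALS entries.
TRAIN = {
    "lang_a": {"order": "SOV", "adposition": "postposition"},
    "lang_b": {"order": "SOV", "adposition": "postposition"},
    "lang_c": {"order": "SVO", "adposition": "preposition"},
    "lang_d": {"order": "SVO", "adposition": "preposition"},
    "lang_e": {"order": "SOV", "adposition": "preposition"},
}


def conditional_counts(train, target):
    """Count target values co-occurring with each (feature, value) pair."""
    counts = defaultdict(Counter)
    for feats in train.values():
        if target not in feats:
            continue
        for name, value in feats.items():
            if name != target:
                counts[(name, value)][feats[target]] += 1
    return counts


def predict(train, known, target):
    """Return (value, confidence) from the most confident single condition.

    Confidence is the estimated P(target = value | feature = observed value).
    Returns (None, 0.0) when no known feature was seen in training.
    """
    counts = conditional_counts(train, target)
    best = (None, 0.0)
    for pair in known.items():
        dist = counts.get(pair)
        if not dist:
            continue
        value, n = dist.most_common(1)[0]
        prob = n / sum(dist.values())
        if prob > best[1]:
            best = (value, prob)
    return best


def combine(pred_a, pred_b):
    """Keep whichever (value, confidence) pair is more confident."""
    return pred_a if pred_a[1] >= pred_b[1] else pred_b
```

In this sketch, `combine` could take one prediction from the count-based estimator and one from any other predictor that reports a confidence score, such as a neural classifier's highest class probability.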
Authors
Topics
Machine Learning > Core Methods > Classification
Machine Learning > Core Methods > Embedding Learning
Machine Learning > Optimization & Theory > Statistical Learning
Interdisciplinary > Linguistics
Interdisciplinary > Linguistics > Computational Linguistics
Machine Learning > Learning Types > Representation Learning
Machine Learning > Learning Types > Multi-Label Classification