EMNLP 2018
Word Embeddings for Code-Mixed Language Processing
Abstract
We compare three existing bilingual word embedding approaches with a novel approach that trains skip-grams on synthetic code-mixed text generated through linguistic models of code-mixing. We evaluate them on two tasks for code-mixed text: sentiment analysis and POS tagging. Our results show that CVM-based embeddings perform as well as the proposed technique on the semantic task, and CCA-based embeddings do so on the syntactic task, but the proposed approach provides the best performance across both tasks overall. Thus, this study demonstrates that existing bilingual embedding techniques are not ideal for code-mixed text processing and that multilingual word embeddings need to be learned from code-mixed text.
Authors
Topics
Machine Learning > Core Methods > Embedding Learning
Deep Learning > Architectures > Neural Networks
Natural Language Processing > Resources & Methods > Lexical Semantics
Natural Language Processing > Resources & Methods > Text Representation
Interdisciplinary > Linguistics > Computational Linguistics