Aligning Multilingual Embeddings for Improved Code-switched Natural Language Understanding

Barah Fazili; Preethi Jyothi

COLING 2022

Abstract

Multilingual pretrained models, while effective on monolingual data, need additional training to work well with code-switched text. In this work, we present a novel idea of training multilingual models with alignment objectives using parallel text so as to explicitly align word representations with the same underlying semantics across languages. Such an explicit alignment step has a positive downstream effect and improves performance on multiple code-switched NLP tasks. We explore two alignment strategies and report improvements of up to 7.32%, 0.76% and 1.9% on Hindi-English Sentiment Analysis, Named Entity Recognition and Question Answering tasks, respectively, compared to a competitive baseline model.
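
The abstract does not say which two alignment strategies are used, so the following is only a minimal sketch of one generic way to write an alignment objective over word pairs taken from parallel text: a contrastive (InfoNCE-style) loss that pulls each source word vector toward its aligned target word vector and away from the other target words in the batch. The function names, the temperature value and the toy vectors are all assumptions made for illustration and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_alignment_loss(src_vecs, tgt_vecs, temperature=0.1):
    """Mean InfoNCE-style loss over aligned word pairs.

    src_vecs[i] and tgt_vecs[i] are assumed to be embeddings of a word
    pair that a word aligner matched in a parallel sentence. Every other
    target vector in the batch serves as a negative for src_vecs[i].
    The temperature of 0.1 is an illustrative choice, not a paper setting.
    """
    total = 0.0
    for i, src in enumerate(src_vecs):
        logits = [cosine(src, tgt) / temperature for tgt in tgt_vecs]
        # Numerically stable log-sum-exp over all candidate targets.
        max_logit = max(logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(logit - max_logit) for logit in logits)
        )
        # Negative log-probability of picking the aligned target.
        total += log_denominator - logits[i]
    return total / len(src_vecs)


# Toy 2-dimensional "embeddings" (hypothetical values, for illustration only).
hindi = [[1.0, 0.0], [0.0, 1.0]]
english_aligned = [[0.9, 0.1], [0.1, 0.9]]   # pair i sits close to hindi[i]
english_swapped = [[0.1, 0.9], [0.9, 0.1]]   # pairs deliberately mismatched

loss_aligned = contrastive_alignment_loss(hindi, english_aligned)
loss_swapped = contrastive_alignment_loss(hindi, english_swapped)
print(loss_aligned, loss_swapped)
```

On the toy data the loss is lower when each Hindi vector lies close to its English counterpart than when the pairs are swapped, which is the behaviour an explicit alignment objective is meant to reward. In an actual training setup the vectors would come from the multilingual encoder, and a term of this kind would be minimised on parallel text alongside or before the downstream task objective.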

Topics

Machine Learning > Application Areas > Domain Adaptation
Natural Language Processing > Resources & Methods > Text Representation
Interdisciplinary > Linguistics > Computational Linguistics
Natural Language Processing > Applications > Sentiment Analysis
Natural Language Processing > Applications > Named Entity Recognition
Natural Language Processing > Resources & Methods > Transfer Learning
Deep Learning > Learning Types > Contrastive Learning

Keywords

transfer learning, domain adaptation, sentiment analysis, cross-lingual transfer, question answering, named entity recognition, word alignment, multilingual embedding, multilingual embedding alignment
