EACL 2023
Token-level Identification of Multiword Expressions using Pre-trained Multilingual Language Models
Abstract
In this paper, we consider novel cross-lingual settings for multiword expression (MWE) identification (Ramisch et al., 2020) and idiomaticity prediction (Tayyar Madabushi et al., 2022) in which systems are tested on languages that are unseen during training. Our findings indicate that pre-trained multilingual language models are able to learn knowledge about MWEs and idiomaticity that is not language-specific. Moreover, we find that training data from other languages can be leveraged to give improvements over monolingual models.