2024
EACL
EACL 2024
Modular Adaptation of Multilingual Encoders to Written Swiss German Dialect
Abstract
AbstractCreating neural text encoders for written Swiss German is challenging due to a dearth of training data combined with dialectal variation. In this paper, we build on several existing multilingual encoders and adapt them to Swiss German using continued pre-training. Evaluation on three diverse downstream tasks shows that simply adding a Swiss German adapter to a modular encoder achieves 97.5% of fully monolithic adaptation performance. We further find that for the task of retrieving Swiss German sentences given Standard German queries, adapting a character-level model is more effective than the other adaptation strategies. We release our code and the models trained for our experiments.
🌉
Interdisciplinary Bridge
— Deep Learning and Machine Learning and Natural Language Processing
🧭
Keyword Pioneer
— modular adaptation
🐣
Hot Topic Early Bird
— cross-lingual retrieval
🐝
Cross-Pollinator
— Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Natural Language Processing, Speech & Audio