NAACL 2021

Morph Call: Probing Morphosyntactic Content of Multilingual Transformers

Abstract

The outstanding performance of transformer-based language models on a great variety of NLP and NLU tasks has stimulated interest in exploring their inner workings. Recent research has focused primarily on higher-level and complex linguistic phenomena such as syntax, semantics, world knowledge and common sense. The majority of studies are Anglocentric, and little is known about other languages, specifically their morphosyntactic properties. To this end, our work presents Morph Call, a suite of 46 probing tasks for four Indo-European languages of different morphology: Russian, French, English and German. We propose a new type of probing task based on the detection of guided sentence perturbations. We use a combination of neuron-, layer- and representation-level introspection techniques to analyze the morphosyntactic content of four multilingual transformers, including their understudied distilled versions. We also examine how fine-tuning on the POS-tagging task affects probing performance.
