Analyzing the Domain Robustness of Pretrained Language Models, Layer by Layer

Abhinav Ramesh Kashyap; Laiba Mehnaz; Bhavitvya Malik; Abdul Waheed; Devamanyu Hazarika; Min-Yen Kan; Rajiv Ratn Shah

EACL 2021

Abstract

The robustness of pretrained language models (PLMs) is generally measured using performance drops on two or more domains. However, we do not yet understand the inherent robustness achieved by contributions from different layers of a PLM. We systematically analyze the robustness of these representations layer by layer from two perspectives. First, we measure the robustness of representations using the domain divergence between two domains. We find that i) domain variance increases from the lower to the upper layers for vanilla PLMs; ii) models continuously pretrained on domain-specific data (DAPT) (Gururangan et al., 2020) exhibit more variance than their pretrained PLM counterparts; and iii) distilled models (e.g., DistilBERT) also show greater domain variance. Second, we investigate the robustness of representations by analyzing the encoded syntactic and semantic information using diagnostic probes. We find that similar layers have similar amounts of linguistic information for data from an unseen domain.
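As a rough illustration of what a layer-by-layer divergence measurement could look like, the sketch below computes one divergence score per layer between representations from two domains. The abstract does not say which divergence measure is used, so the choice here (a linear-kernel maximum mean discrepancy, i.e. the squared distance between the two domain means) is an assumption, as are the function names and the toy vectors standing in for real PLM hidden states.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def mean_vector(vectors: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


def linear_mmd(source: Sequence[Vector], target: Sequence[Vector]) -> float:
    """Linear-kernel MMD: squared Euclidean distance between domain means.

    Assumption: this stands in for whatever divergence the paper uses.
    """
    mu_s = mean_vector(source)
    mu_t = mean_vector(target)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


def layerwise_divergence(
    source_layers: Dict[int, Sequence[Vector]],
    target_layers: Dict[int, Sequence[Vector]],
) -> Dict[int, float]:
    """Return one divergence score per layer index."""
    return {
        layer: linear_mmd(source_layers[layer], target_layers[layer])
        for layer in sorted(source_layers)
    }


# Hypothetical toy data: two layers, two examples per domain, 2-d vectors.
# Real inputs would be per-layer sentence representations from a PLM.
source_layers = {0: [[0.0, 0.0], [0.2, 0.0]], 1: [[0.0, 1.0], [0.0, 1.0]]}
target_layers = {0: [[0.1, 0.0], [0.1, 0.0]], 1: [[1.0, 1.0], [3.0, 1.0]]}

divergences = layerwise_divergence(source_layers, target_layers)
for layer, score in divergences.items():
    print(f"layer {layer}: divergence = {score:.3f}")
```

In this toy setup the two domains share the same mean at layer 0 and differ at layer 1, so the score grows with depth. That mirrors the shape of finding i) without reproducing the paper's actual measurements.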

Topics

Machine Learning > Application Areas > Domain Adaptation

Machine Learning > Application Areas > Domain Generalization

Keywords

layer-wise analysis; pretrained language model; domain robustness; domain divergence; diagnostic probe
