On the Interplay Between Fine-tuning and Sentence-Level Probing for Linguistic Knowledge in Pre-Trained Transformers

Marius Mosbach; Anna Khokhlova; Michael A. Hedderich; Dietrich Klakow

2020 EMNLP EMNLP 2020

On the Interplay Between Fine-tuning and Sentence-Level Probing for Linguistic Knowledge in Pre-Trained Transformers

Abstract

AbstractFine-tuning pre-trained contextualized embedding models has become an integral part of the NLP pipeline. At the same time, probing has emerged as a way to investigate the linguistic knowledge captured by pre-trained models. Very little is, however, understood about how fine-tuning affects the representations of pre-trained models and thereby the linguistic knowledge they encode. This paper contributes towards closing this gap. We study three different pre-trained models: BERT, RoBERTa, and ALBERT, and investigate through sentence-level probing how fine-tuning affects their representations. We find that for some probing tasks fine-tuning leads to substantial changes in accuracy, possibly suggesting that fine-tuning introduces or even removes linguistic knowledge from a pre-trained model. These changes, however, vary greatly across different models, fine-tuning and probing tasks. Our analysis reveals that while fine-tuning indeed changes the representations of a pre-trained model and these changes are typically larger for higher layers, only in very few cases, fine-tuning has a positive effect on probing accuracy that is larger than just using the pre-trained model with a strong pooling method. Based on our findings, we argue that both positive and negative effects of fine-tuning on probing require a careful interpretation.

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning and Natural Language Processing

🧭 Keyword Pioneer — probing accuracy

🐣 Hot Topic Early Bird — model adaptation

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Marius Mosbach , Anna Khokhlova , Michael A. Hedderich , Dietrich Klakow

Topics

Machine Learning > Optimization & Theory > Learning Theory Deep Learning > Architectures > Transformers Natural Language Processing > Resources & Methods > Text Representation Deep Learning > Learning Types > Fine-Tuning Deep Learning > Models > Language Models Deep Learning > Optimization & Theory > Evaluation

Keywords

model adaptation linguistic knowledge bert fine-tuning contextualized embedding pre-trained transformer sentence-level probing probing accuracy

Download PDF

Related papers

Fast semantic parsing with well-typedness guarantees 2020

Detecting Objectifying Language in Online Professor Reviews 2020

Analogous Process Structure Induction for Sub-event Sequence Prediction 2020

Aspect Sentiment Classification with Aspect-Specific Opinion Spans 2020

Robust and Interpretable Grounding of Spatial References with Relation Networks 2020