Robust Transfer Learning with Pretrained Language Models through Adapters

Wenjuan Han; Bo Pang; Ying Nian Wu

ACL 2021

Abstract

Transfer learning with large pretrained transformer-based language models such as BERT has become the dominant approach for most NLP tasks. Simply fine-tuning these large language models on downstream tasks, or combining fine-tuning with task-specific pretraining, is often not robust. In particular, performance varies considerably as the random seed changes or as the number of pretraining and/or fine-tuning iterations varies, and the fine-tuned model is vulnerable to adversarial attacks. We propose a simple yet effective adapter-based approach to mitigate these issues. Specifically, we insert small bottleneck layers (i.e., adapters) within each layer of a pretrained model, then freeze the pretrained layers and train only the adapter layers on the downstream task data, with (1) task-specific unsupervised pretraining followed by (2) task-specific supervised training (e.g., classification, sequence labeling). Our experiments demonstrate that this training scheme improves stability and adversarial robustness in transfer learning to various downstream tasks.
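The bottleneck idea described in the abstract can be illustrated with a minimal pure-Python sketch. The abstract does not specify the adapter's internals, so everything here is an assumption for illustration: the class name `BottleneckAdapter`, the down-project / nonlinearity / up-project / residual layout (a common adapter design), the GELU activation, the omission of bias terms, and the zero initialization of the up-projection. It is not the authors' implementation.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def gelu(v):
    """GELU nonlinearity (tanh approximation)."""
    return 0.5 * v * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (v + 0.044715 * v ** 3)))


class BottleneckAdapter:
    """Hypothetical adapter: down-project, nonlinearity, up-project, residual add."""

    def __init__(self, hidden_size, bottleneck_size, seed=0):
        rng = random.Random(seed)
        # Down-projection (hidden_size -> bottleneck_size), small random init.
        self.W_down = [[rng.gauss(0.0, 0.02) for _ in range(hidden_size)]
                       for _ in range(bottleneck_size)]
        # Up-projection (bottleneck_size -> hidden_size), zero init (an assumption),
        # so the adapter starts out as an identity map over the frozen layer output.
        self.W_up = [[0.0] * bottleneck_size for _ in range(hidden_size)]

    def num_parameters(self):
        """Count the adapter weights, the only parameters that would be trained."""
        return sum(len(r) for r in self.W_down) + sum(len(r) for r in self.W_up)

    def forward(self, h):
        """Return h plus the bottleneck correction computed from h."""
        z = [gelu(v) for v in matvec(self.W_down, h)]
        delta = matvec(self.W_up, z)
        return [a + b for a, b in zip(h, delta)]


hidden_size, bottleneck_size = 8, 2
adapter = BottleneckAdapter(hidden_size, bottleneck_size)

h = [0.1 * i for i in range(hidden_size)]  # stand-in for a frozen layer's output
out = adapter.forward(h)

# Stand-in for a frozen pretrained sublayer: one dense hidden-to-hidden matrix.
frozen_params = hidden_size * hidden_size
trainable_params = adapter.num_parameters()
```

With these toy sizes the adapter holds 32 weights against 64 in the frozen stand-in matrix; at realistic hidden sizes with a small bottleneck, the trainable share would be far smaller. Because the up-projection starts at zero in this sketch, `out` equals `h` before any training, so inserting the adapter does not initially change the pretrained model's behavior.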

Topics

Artificial Intelligence > Core AI > Model Compression
Artificial Intelligence > Learning Paradigms > Transfer Learning
Machine Learning > Application Areas > Domain Adaptation
Natural Language Processing > Resources & Methods > Large Language Models
Artificial Intelligence > Learning Paradigms > Self-Supervised Learning

Keywords

model compression; adversarial robustness; transfer learning; parameter efficient; language model; parameter-efficient fine-tuning; model fine-tuning; parameter-efficient tuning; pretrained language model; adapter module; parameter efficient tuning; large language model; task-specific pretraining; adapter-based transfer learning
