RoAST: Robustifying Language Models via Adversarial Perturbation with Selective Training

Jaehyung Kim; Yuning Mao; Rui Hou; Hanchao Yu; Davis Liang; Pascale Fung; Qifan Wang; Fuli Feng; Lifu Huang; Madian Khabsa

2023 EMNLP EMNLP 2023

RoAST: Robustifying Language Models via Adversarial Perturbation with Selective Training

Abstract

AbstractFine-tuning pre-trained language models (LMs) has become the de facto standard in many NLP tasks. Nevertheless, fine-tuned LMs are still prone to robustness issues, such as adversarial robustness and model calibration. Several perspectives of robustness for LMs have been studied independently, but lacking a unified consideration in multiple perspectives. In this paper, we propose Robustifying LMs via Adversarial perturbation with Selective Training (RoAST), a simple yet effective fine-tuning technique to enhance the multi-perspective robustness of LMs in a unified way. RoAST effectively incorporates two important sources for the model robustness, robustness on the perturbed inputs and generalizable knowledge in pre-trained LMs. To be specific, RoAST introduces adversarial perturbation during fine-tuning while the model parameters are selectively updated upon their relative importance to minimize unnecessary deviation. Under a unified evaluation of fine-tuned LMs by incorporating four representative perspectives of model robustness, we demonstrate the effectiveness of RoAST compared to state-of-the-art fine-tuning methods on six different types of LMs, which indicates its usefulness in practice.

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning and Natural Language Processing

🧭 Keyword Pioneer — selective training

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Jaehyung Kim , Yuning Mao , Rui Hou , Hanchao Yu , Davis Liang , Pascale Fung , Qifan Wang , Fuli Feng , Lifu Huang , Madian Khabsa

Topics

Machine Learning > Learning Types > Adversarial Learning Natural Language Processing > Generation > Language Modeling Machine Learning > Learning Types > Transfer Learning Natural Language Processing > Resources & Methods > Language Modeling Deep Learning > Learning Types > Adversarial Learning

Keywords

adversarial robustness model calibration model robustness adversarial training adversarial perturbation language model pre-trained language model selective training

Download PDF

Related papers

Exploring Linguistic Probes for Morphological Generalization 2023

NameGuess: Column Name Expansion for Tabular Data 2023

Vision-Enhanced Semantic Entity Recognition in Document Images via Visually-Asymmetric Consistency Learning 2023

Improving Conversational Recommendation Systems via Bias Analysis and Language-Model-Enhanced Data Augmentation 2023

On the Calibration of Large Language Models and Alignment 2023