Increasing Robustness to Spurious Correlations using Forgettable Examples

Yadollah Yaghoobzadeh; Soroush Mehri; Remi Tachet des Combes; T. J. Hazen; Alessandro Sordoni

EACL 2021

Abstract

Neural NLP models tend to rely on spurious correlations between labels and input features to perform their tasks. Minority examples, i.e., examples that contradict the spurious correlations present in the majority of data points, have been shown to increase the out-of-distribution generalization of pre-trained language models. In this paper, we first propose using example forgetting to find minority examples without prior knowledge of the spurious correlations present in the dataset. Forgettable examples are instances either learned and then forgotten during training or never learned. We show empirically how these examples are related to minorities in our training sets. Then, we introduce a new approach to robustify models by fine-tuning our models twice, first on the full training data and second on the minorities only. We obtain substantial improvements in out-of-distribution generalization when applying our approach to the MNLI, QQP and FEVER datasets.
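
The definition of forgettable examples given in the abstract can be illustrated with a minimal sketch. It assumes that a per-example correctness record has been collected once per epoch during the first fine-tuning pass; that granularity, the function name `forgettable_indices`, and the toy `history` matrix are hypothetical choices made for illustration and are not taken from the paper.

```python
from typing import List, Sequence


def forgettable_indices(correct_by_epoch: Sequence[Sequence[bool]]) -> List[int]:
    """Return indices of examples that were never learned, or learned and later forgotten.

    correct_by_epoch[t][i] is True if example i was classified correctly at
    checkpoint t (assumed here to be one checkpoint per epoch).
    """
    if not correct_by_epoch:
        return []
    num_examples = len(correct_by_epoch[0])
    forgettable = []
    for i in range(num_examples):
        history = [epoch[i] for epoch in correct_by_epoch]
        # Never learned: incorrect at every checkpoint.
        never_learned = not any(history)
        # Forgetting event: correct at one checkpoint, incorrect at the next.
        forgotten = any(prev and not cur for prev, cur in zip(history, history[1:]))
        if never_learned or forgotten:
            forgettable.append(i)
    return forgettable


# Toy record: 3 checkpoints (rows) for 4 training examples (columns).
history = [
    [True, False, False, True],
    [True, True, False, False],
    [True, True, False, True],
]
print(forgettable_indices(history))  # [2, 3]
```

In this toy record, example 2 is never learned and example 3 is learned, forgotten, and relearned, so both are selected. Under the two-stage approach described in the abstract, a subset selected this way would serve as the training data for the second fine-tuning pass.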

Topics

Artificial Intelligence > Core AI > Interpretability
Machine Learning > Learning Types > Self-Supervised Learning
Machine Learning > Application Areas > Domain Generalization

Keywords

out-of-distribution generalization; spurious correlation; pretrained language model; example forgetting; minority example
