NSP-BERT: A Prompt-based Few-Shot Learner through an Original Pre-training Task —— Next Sentence Prediction

Yi Sun; Yu Zheng; Chao Hao; Hangping Qiu

2022 COLING COLING 2022

NSP-BERT: A Prompt-based Few-Shot Learner through an Original Pre-training Task —— Next Sentence Prediction

Abstract

AbstractUsing prompts to utilize language models to perform various downstream tasks, also known as prompt-based learning or prompt-learning, has lately gained significant success in comparison to the pre-train and fine-tune paradigm. Nonetheless, virtually most prompt-based methods are token-level such as PET based on mask language model (MLM). In this paper, we attempt to accomplish several NLP tasks in the zero-shot and few-shot scenarios using a BERT original pre-training task abandoned by RoBERTa and other models——Next Sentence Prediction (NSP). Unlike token-level techniques, our sentence-level prompt-based method NSP-BERT does not need to fix the length of the prompt or the position to be predicted, allowing it to handle tasks such as entity linking with ease. NSP-BERT can be applied to a variety of tasks based on its properties. We present an NSP-tuning approach with binary cross-entropy loss for single-sentence classification tasks that is competitive compared to PET and EFL. By continuing to train BERT on RoBERTa’s corpus, the model’s performance improved significantly, which indicates that the pre-training corpus is another important determinant of few-shot besides model size and prompt method.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Machine Learning and Natural Language Processing

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Yi Sun , Yu Zheng , Chao Hao , Hangping Qiu

Topics

Machine Learning > Learning Types > Zero-Shot Learning Natural Language Processing > Resources & Methods > Large Language Models Machine Learning > Learning Paradigms > Few-Shot Learning Machine Learning > Learning Types > Few-Shot Learning Artificial Intelligence > Core AI > Large Language Models Natural Language Processing > Resources & Methods > Language Modeling Deep Learning > Models > Large Language Models Deep Learning > Learning Types > Transfer Learning

Keywords

zero-shot learning few-shot learning entity linking prompt-based learning pre-trained language model next sentence prediction mask language model binary cross-entropy

Download PDF

Related papers

MulZDG: Multilingual Code-Switching Framework for Zero-shot Dialogue Generation 2022

The Role of Context and Uncertainty in Shallow Discourse Parsing 2022

SelfMix: Robust Learning against Textual Label Noise with Self-Mixup Training 2022

Complicate Then Simplify: A Novel Way to Explore Pre-trained Models for Text Classification 2022

Repo4QA: Answering Coding Questions via Dense Retrieval on GitHub Repositories 2022