Span Fine-tuning for Pre-trained Language Models

Rongzhou Bao; Zhuosheng Zhang; Hai Zhao

EMNLP 2021

Abstract

Pre-trained language models (PrLMs) must carefully manage input units when training on very large text corpora with vocabularies of millions of words. Previous work has shown that incorporating span-level information over consecutive words during pre-training can further improve the performance of PrLMs. However, because span-level clues are introduced and fixed in pre-training, previous methods are time-consuming and lack flexibility. To address this, this paper presents a novel span fine-tuning method for PrLMs, which allows the span setting to be determined adaptively by the specific downstream task during the fine-tuning phase. In detail, any sentence processed by the PrLM is segmented into multiple spans according to a pre-sampled dictionary. The segmentation information is then passed through a hierarchical CNN module together with the representation outputs of the PrLM to generate a span-enhanced representation. Experiments on the GLUE benchmark show that the proposed span fine-tuning method significantly enhances the PrLM while offering more flexibility in an efficient way.
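
The dictionary-based segmentation step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how dictionary entries are matched, so this example assumes greedy longest-match lookup; the function name `segment_into_spans`, the `max_len` parameter, and the toy dictionary are hypothetical and not taken from the paper. The hierarchical CNN that consumes the resulting span boundaries is not shown.

```python
from typing import List, Set, Tuple


def segment_into_spans(
    tokens: List[str],
    span_dict: Set[Tuple[str, ...]],
    max_len: int = 4,
) -> List[Tuple[int, int]]:
    """Split tokens into half-open (start, end) spans.

    Assumption: greedy longest-match against the dictionary. At each
    position the longest n-gram (2 <= n <= max_len) found in span_dict
    becomes a span; otherwise the single token is its own span.
    """
    spans: List[Tuple[int, int]] = []
    i = 0
    while i < len(tokens):
        match_len = 1  # fall back to a one-token span
        longest = min(max_len, len(tokens) - i)
        for n in range(longest, 1, -1):
            if tuple(tokens[i:i + n]) in span_dict:
                match_len = n
                break
        spans.append((i, i + match_len))
        i += match_len
    return spans


# Hypothetical toy dictionary standing in for the pre-sampled one.
example_dict = {("new", "york"), ("language", "model")}
tokens = "the new york language model works".split()
spans = segment_into_spans(tokens, example_dict)
print(spans)  # [(0, 1), (1, 3), (3, 5), (5, 6)]
```

Each `(start, end)` pair marks a group of consecutive tokens whose PrLM output vectors would be processed together as one span; under these assumptions, changing the dictionary changes the span setting without touching the pre-trained model.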

Topics

Deep Learning > Techniques > Pretraining; Natural Language Processing > Resources & Methods > Large Language Models; Machine Learning > Learning Types > Transfer Learning; Artificial Intelligence > Core AI > Large Language Models; Natural Language Processing > Resources & Methods > Language Modeling; Deep Learning > Techniques > Fine-Tuning; Deep Learning > Learning Types > Fine-Tuning

Keywords

text representation; pre-trained language model; GLUE benchmark; span fine-tuning; hierarchical CNN; span-level representation; span-level information
