Few-Shot Text Classification with Triplet Networks, Data Augmentation, and Curriculum Learning

Jason Wei; Chengyu Huang; Soroush Vosoughi; Yu Cheng; Shiqi Xu

NAACL 2021

Abstract

Few-shot text classification is a fundamental NLP task in which a model aims to classify text into a large number of categories, given only a few training examples per category. This paper explores data augmentation, a technique particularly suitable for training with limited data, for this few-shot, highly multiclass text classification setting. On four diverse text classification tasks, we find that common data augmentation techniques can improve the performance of triplet networks by up to 3.0% on average. To further boost performance, we present a simple training strategy called curriculum data augmentation, which leverages curriculum learning by first training on only original examples and then introducing augmented data as training progresses. We explore a two-stage and a gradual schedule, and find that, compared with standard single-stage training, curriculum data augmentation trains faster, improves performance, and remains robust to high amounts of noising from augmentation.
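The two schedules named in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `augmentation_strength`, the idea of a single scalar "strength" knob, the 50% warm-up point for the two-stage schedule, and the five evenly spaced levels for the gradual schedule. The only property carried over from the abstract is that training starts on original examples only (strength 0.0) and augmented data is introduced later.

```python
def augmentation_strength(step, total_steps, schedule="two_stage",
                          max_strength=0.5, warmup_fraction=0.5, num_stages=5):
    """Return a hypothetical augmentation strength for a given training step.

    A strength of 0.0 means "train on original examples only". All default
    values are illustrative assumptions, not settings from the paper.
    """
    if not 0 <= step < total_steps:
        raise ValueError("step must satisfy 0 <= step < total_steps")

    if schedule == "two_stage":
        # Stage 1: original data only. Stage 2: full-strength augmentation.
        return 0.0 if step < warmup_fraction * total_steps else max_strength

    if schedule == "gradual":
        if num_stages < 2:
            raise ValueError("gradual schedule needs at least two stages")
        # Split training into equal stages; integer math avoids float drift.
        stage = step * num_stages // total_steps  # 0 .. num_stages - 1
        # Stage 0 uses no augmentation; the last stage uses max_strength.
        return max_strength * stage / (num_stages - 1)

    raise ValueError(f"unknown schedule: {schedule!r}")
```

In a training loop, the returned value would be passed to whatever augmentation routine is in use, for example as the fraction of tokens to perturb. With the assumed defaults, the two-stage schedule switches from 0.0 to 0.5 halfway through training, while the gradual schedule steps through 0.0, 0.125, 0.25, 0.375, and 0.5.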

Topics

Artificial Intelligence > Learning Paradigms > Few-Shot Learning
Machine Learning > Application Areas > Data Augmentation
Natural Language Processing > Applications > Text Classification

Keywords

few-shot learning, text classification, curriculum learning, data augmentation, triplet network
