Data Augmentation for Text Generation Without Any Augmented Data

Wei Bi; Huayang Li; Jiacheng Huang

ACL 2021

Abstract

Data augmentation is an effective way to improve the performance of many neural text generation models. However, current data augmentation methods need to define or choose proper data mapping functions that map the original samples into the augmented samples. In this work, we derive an objective to formulate the problem of data augmentation on text generation tasks without any use of augmented data constructed by specific mapping functions. Our proposed objective can be efficiently optimized and applied to popular loss functions on text generation tasks with a convergence rate guarantee. Experiments on five datasets of two text generation tasks show that our approach can approximate or even surpass popular data augmentation methods.
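
For readers unfamiliar with the term, the sketch below illustrates what a conventional "data mapping function" might look like: a hand-chosen transformation that turns an original sample into an augmented one. It is not the method proposed in the paper, which avoids constructing augmented data altogether. The names drop_tokens and augment, the token-dropping strategy, and the parameters drop_prob and copies are all hypothetical choices made for illustration.

```python
import random
from typing import Callable, List, Optional, Tuple

Pair = Tuple[List[str], List[str]]


def drop_tokens(tokens: List[str], drop_prob: float = 0.1,
                rng: Optional[random.Random] = None) -> List[str]:
    """Hypothetical mapping function: randomly drop tokens from a sample."""
    rng = rng or random.Random()
    # Keep a token when the random draw is at least drop_prob.
    kept = [tok for tok in tokens if rng.random() >= drop_prob]
    # Never return an empty sample; fall back to a copy of the original.
    return kept if kept else list(tokens)


def augment(pairs: List[Pair],
            mapping_fn: Callable[[List[str]], List[str]],
            copies: int = 2) -> List[Pair]:
    """Build augmented (source, target) pairs by mapping each source."""
    augmented = []
    for source, target in pairs:
        for _ in range(copies):
            # Only the source side is perturbed; the target is left as-is.
            augmented.append((mapping_fn(source), target))
    return augmented


if __name__ == "__main__":
    rng = random.Random(0)
    data = [("the cat sat on the mat".split(), "a cat is sitting".split())]
    for src, tgt in augment(data, lambda s: drop_tokens(s, 0.3, rng)):
        print(" ".join(src), "->", " ".join(tgt))
```

Choosing the transformation and its strength (here, drop_prob) is exactly the kind of design decision the abstract says existing methods require.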

Topics

Machine Learning > Optimization & Theory > Optimization
Machine Learning > Application Areas > Data Augmentation
Natural Language Processing > Generation > Text Generation
Machine Learning > Learning Types > Data Augmentation
Deep Learning > Learning Types > Text Generation

Keywords

data augmentation; text generation; loss function optimization; neural text generation; convergence guarantee; loss function; neural network

Related papers

Out-of-Scope Intent Detection with Self-Supervision and Discriminative Training 2021

A Non-Autoregressive Edit-Based Approach to Controllable Text Simplification 2021

How Did This Get Funded?! Automatically Identifying Quirky Scientific Achievements 2021

Exploring Discourse Structures for Argument Impact Classification 2021

Language Embeddings for Typology and Cross-lingual Transfer Learning 2021