Comparing Data Augmentation Methods for End-to-End Task-Oriented Dialog Systems

Christos Vlachos; Themos Stafylakis; Ion Androutsopoulos

ACL 2024

Abstract

Creating effective and reliable task-oriented dialog systems (ToDSs) is challenging, not only because of the complex structure of these systems, but also due to the scarcity of training data, especially when several modules need to be trained separately, each one with its own input/output training examples. Data augmentation (DA), whereby synthetic training examples are added to the training data, has been successful in other NLP systems, but has not been explored as extensively in ToDSs. We empirically evaluate the effectiveness of DA methods in an end-to-end ToDS setting, where a single system is trained to handle all processing stages, from user inputs to system outputs. We experiment with two ToDSs (UBAR, GALAXY) on two datasets (MultiWOZ, KVRET). We consider three types of DA methods (word-level, sentence-level, dialog-level), comparing eight DA methods that have shown promising results in ToDSs and other NLP systems. We show that all DA methods considered are beneficial, and we highlight the best ones, also providing advice to practitioners. We also introduce a more challenging few-shot cross-domain ToDS setting, reaching similar conclusions.
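To make the idea of word-level DA concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not name the eight methods compared, so the two operations shown (random token deletion and random token swap) are assumptions chosen purely for illustration, and the function names `random_deletion`, `random_swap`, and `augment` are hypothetical.

```python
import random


def random_deletion(tokens, p, rng):
    """Drop each token with probability p, always keeping at least one token."""
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def random_swap(tokens, n_swaps, rng):
    """Swap two randomly chosen positions, n_swaps times."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def augment(utterances, copies=2, p_delete=0.1, n_swaps=1, seed=0):
    """Return the original utterances followed by `copies` synthetic variants of each.

    Assumption: whitespace tokenization and alternating deletion/swap are
    illustrative choices only; they are not taken from the paper.
    """
    rng = random.Random(seed)
    augmented = list(utterances)
    for utt in utterances:
        tokens = utt.split()
        if not tokens:
            continue
        for k in range(copies):
            if k % 2 == 0:
                new_tokens = random_deletion(tokens, p_delete, rng)
            else:
                new_tokens = random_swap(tokens, n_swaps, rng)
            augmented.append(" ".join(new_tokens))
    return augmented


if __name__ == "__main__":
    data = ["i need a cheap hotel in the north", "book a table for two"]
    for line in augment(data):
        print(line)
```

In an end-to-end ToDS, a perturbation like this would presumably have to leave slot values untouched (or update the annotations accordingly), so that the augmented user input still matches its belief state and system response.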

Topics

Machine Learning > Application Areas > Data Augmentation; Natural Language Processing > Generation > Dialogue Systems; Natural Language Processing > Applications > Dialogue Systems; Machine Learning > Learning Types > Data Augmentation; Deep Learning > Learning Types > Data Augmentation; Artificial Intelligence > Core AI > Dialogue Systems

Keywords

few-shot learning; data augmentation; task-oriented dialogue; end-to-end learning; end-to-end training; dialogue system; task-oriented dialog system
