EMNLP 2018
Auto-Dialabel: Labeling Dialogue Data with Unsupervised Learning
Abstract
The lack of labeled data is one of the main challenges when building a task-oriented dialogue system. Existing dialogue datasets usually rely on human labeling, which is expensive, limited in size, and low in coverage. In this paper, we instead propose a framework, Auto-Dialabel, to automatically cluster dialogue intents and slots. In this framework, we collect a set of context features, leverage an autoencoder for feature assembly, and adapt a dynamic hierarchical clustering method for intent and slot labeling. Experimental results show that our framework can reduce human labeling cost to a great extent, achieve good intent clustering accuracy (84.1%), and provide reasonable and instructive slot labeling results.
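To make the clustering step concrete, the sketch below groups utterances by intent with a simple agglomerative (hierarchical) procedure. It is an illustration under stated assumptions, not the authors' implementation: bag-of-words counts stand in for the autoencoder-assembled context features, "dynamic" is approximated by stopping merges at a distance threshold instead of fixing the number of clusters, and the names `bag_of_words`, `cosine_distance`, `cluster_utterances`, and `threshold` are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(utterance):
    """Toy stand-in for the paper's context features: lowercase token counts."""
    return Counter(utterance.lower().split())


def cosine_distance(a, b):
    """Cosine distance between two sparse count vectors (1.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def cluster_utterances(utterances, threshold=0.6):
    """Average-linkage agglomerative clustering over utterance indices.

    Merging stops once the closest pair of clusters is farther apart than
    `threshold` (an assumed stopping rule), so the number of clusters is
    not chosen in advance.
    """
    vectors = [bag_of_words(u) for u in utterances]
    clusters = [[i] for i in range(len(utterances))]

    def average_link(c1, c2):
        dists = [cosine_distance(vectors[i], vectors[j]) for i in c1 for j in c2]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        # Find the closest pair of clusters under average linkage.
        dist, i, j = min(
            ((average_link(clusters[i], clusters[j]), i, j)
             for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda item: item[0],
        )
        if dist > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


utterances = [
    "book a flight to boston",
    "book a flight to denver",
    "what is the weather today",
    "what is the weather tomorrow",
]
print(cluster_utterances(utterances))
```

Each resulting cluster is a list of utterance indices that a human annotator could then name once, instead of labeling every utterance individually. The threshold controls granularity: a lower value yields more, tighter clusters.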
Topics
Machine Learning > Core Methods > Clustering
Machine Learning > Learning Types > Unsupervised Learning
Natural Language Processing > Applications > Intent Classification
Natural Language Processing > Applications > Dialogue Systems
Machine Learning > Learning Paradigms > Unsupervised Learning
Artificial Intelligence > Core AI > Dialogue Systems