2019 ACL ACL 2019

Unsupervised Paraphrasing without Translation

Abstract

AbstractParaphrasing is an important task demonstrating the ability to abstract semantic content from its surface form. Recent literature on automatic paraphrasing is dominated by methods leveraging machine translation as an intermediate step. This contrasts with humans, who can paraphrase without necessarily being bilingual. This work proposes to learn paraphrasing models only from a monolingual corpus. To that end, we propose a residual variant of vector-quantized variational auto-encoder. Our experiments consider paraphrase identification, and paraphrasing for training set augmentation, comparing to supervised and unsupervised translation-based approaches. Monolingual paraphrasing is shown to outperform unsupervised translation in all contexts. The comparison with supervised MT is more mixed: monolingual paraphrasing is interesting for identification and augmentation but supervised MT is superior for generation.

🌉 Interdisciplinary Bridge — Machine Learning and Natural Language Processing
🧭 Keyword Pioneer — semantic content
🐝 Cross-Pollinator — Artificial Intelligence, Computer Vision, Data Science & Analytics, Deep Learning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Speech & Audio
🐣 Hot Topic Early Bird — vector quantization