Factorising Meaning and Form for Intent-Preserving Paraphrasing

Tom Hosking; Mirella Lapata

ACL 2021

Abstract

We propose a method for generating paraphrases of English questions that retain the original intent but use a different surface form. Our model combines a careful choice of training objective with a principled information bottleneck to induce a latent encoding space that disentangles meaning and form. We train an encoder-decoder model to reconstruct a question from a paraphrase with the same meaning and an exemplar with the same surface form, leading to separated encoding spaces. We use a Vector-Quantized Variational Autoencoder to represent the surface form as a set of discrete latent variables, allowing us to use a classifier to select a different surface form at test time. Crucially, our method does not require access to an external source of target exemplars. Extensive experiments and a human evaluation show that we are able to generate paraphrases with a better tradeoff between semantic preservation and syntactic novelty compared to previous methods.
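
The vector-quantization step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`quantize`, `quantize_heads`), the toy codebook values, and the choice to split the encoding into equal-sized chunks with one codebook per chunk are all assumptions made for illustration. The sketch only shows the general idea of replacing a continuous vector with its nearest codebook entry, so that the surface form becomes a set of discrete codes that a classifier could in principle predict.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    vector: Sequence[float], codebook: Sequence[Sequence[float]]
) -> Tuple[int, List[float]]:
    """Return the index and value of the codebook entry nearest to `vector`."""
    index = min(
        range(len(codebook)),
        key=lambda k: squared_distance(vector, codebook[k]),
    )
    return index, list(codebook[index])


def quantize_heads(
    encoding: Sequence[float],
    codebooks: Sequence[Sequence[Sequence[float]]],
) -> Tuple[List[int], List[float]]:
    """Split `encoding` into equal chunks and quantize each chunk separately.

    Hypothetical layout: one codebook per chunk ("head"). The returned code
    list is the discrete representation; the second value is the quantized
    encoding rebuilt from the selected codebook entries.
    """
    if len(encoding) % len(codebooks) != 0:
        raise ValueError("encoding length must be divisible by number of heads")
    size = len(encoding) // len(codebooks)
    codes: List[int] = []
    quantized: List[float] = []
    for head, codebook in enumerate(codebooks):
        chunk = encoding[head * size:(head + 1) * size]
        index, entry = quantize(chunk, codebook)
        codes.append(index)
        quantized.extend(entry)
    return codes, quantized


# Toy codebook with three 2-dimensional entries (illustrative values only).
TOY_CODEBOOK = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]

if __name__ == "__main__":
    codes, quantized = quantize_heads([0.1, 0.2, 3.9, 4.1], [TOY_CODEBOOK, TOY_CODEBOOK])
    print(codes, quantized)
```

In a trained model the codebook entries would be learned jointly with the encoder and decoder; here they are fixed so that the nearest-neighbour lookup can be followed by hand.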

Topics

Machine Learning > Core Methods > Representation Learning
Machine Learning > Optimization & Theory > Stochastic Processes
Deep Learning > Models > Variational Inference
Natural Language Processing > Applications > Text Generation
Deep Learning > Learning Types > Representation Learning

Keywords

representation learning, paraphrase generation, vector quantization, latent representation, disentangled representation, variational autoencoder, semantic preservation, encoder-decoder model, surface form, intent preservation, intent-preserving paraphrase
