A Deep Decomposable Model for Disentangling Syntax and Semantics in Sentence Representation

Dingcheng Li; Hongliang Fei; Shaogang Ren; Ping Li

2021 EMNLP EMNLP 2021

A Deep Decomposable Model for Disentangling Syntax and Semantics in Sentence Representation

Abstract

AbstractRecently, disentanglement based on a generative adversarial network or a variational autoencoder has significantly advanced the performance of diverse applications in CV and NLP domains. Nevertheless, those models still work on coarse levels in the disentanglement of closely related properties, such as syntax and semantics in human languages. This paper introduces a deep decomposable model based on VAE to disentangle syntax and semantics by using total correlation penalties on KL divergences. Notably, we decompose the KL divergence term of the original VAE so that the generated latent variables can be separated in a more clear-cut and interpretable way. Experiments on benchmark datasets show that our proposed model can significantly improve the disentanglement quality between syntactic and semantic representations for semantic similarity tasks and syntactic similarity tasks.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Machine Learning

🧭 Keyword Pioneer — syntax-semantic separation

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Dingcheng Li , Hongliang Fei , Shaogang Ren , Ping Li

Topics

Machine Learning > Core Methods > Representation Learning Machine Learning > Learning Types > Self-Supervised Learning Deep Learning > Models > Variational Inference Deep Learning > Learning Types > Representation Learning Artificial Intelligence > Core AI > Representation Learning

Keywords

representation learning latent variable disentangled representation variational autoencoder total correlation syntax-semantic separation

Download PDF

Related papers

Continual Learning in Multilingual NMT via Language-Specific Embeddings 2021

MultiDoc2Dial: Modeling Dialogues Grounded in Multiple Documents 2021

Efficient Multi-Task Auxiliary Learning: Selecting Auxiliary Data by Feature Similarity 2021

Neural Machine Translation with Heterogeneous Topic Knowledge Embeddings 2021

Semantics-Preserved Data Augmentation for Aspect-Based Sentiment Analysis 2021