Can Network Embedding of Distributional Thesaurus Be Combined with Word Vectors for Better Representation?

Abhik Jana; Pawan Goyal

NAACL 2018

Abstract

Distributed representations of words learned from text have proved successful in various natural language processing tasks in recent times. While some methods represent words as vectors computed from text using a predictive model (Word2vec) or a dense count-based model (GloVe), others represent them in a distributional thesaurus network, where the neighborhood of a word is a set of words having adequate context overlap. Motivated by the recent surge of research in network embedding techniques (DeepWalk, LINE, node2vec, etc.), we turn a distributional thesaurus network into dense word vectors and investigate the usefulness of distributional thesaurus embedding in improving overall word representation. This is the first attempt to show that combining the proposed word representation, obtained by distributional thesaurus embedding, with state-of-the-art word representations improves performance by a significant margin on NLP tasks such as word similarity and relatedness, synonym detection, and analogy detection. Additionally, we show that even without using any handcrafted lexical resources, we can obtain representations whose performance on the word similarity and relatedness tasks is comparable to that of representations built with a lexical resource.
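The abstract does not say how the two representations are combined. As a minimal sketch, assuming simple concatenation of L2-normalized vectors (one plausible choice, not necessarily the authors' method), the idea can be illustrated as follows. The helper names and the toy vectors are hypothetical and exist only for this example.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def combine(text_vec, dt_vec):
    """Concatenate a text-based vector with a thesaurus-network vector.

    Assumption: concatenation after normalization, so that neither
    source dominates the combined vector purely because of its scale.
    """
    return l2_normalize(text_vec) + l2_normalize(dt_vec)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


# Made-up toy vectors (not real embeddings): the text-based vectors of the
# two words are orthogonal, while their thesaurus-network vectors coincide.
text_vectors = {"car": [1.0, 0.0], "automobile": [0.0, 1.0]}
dt_vectors = {"car": [1.0, 0.0, 0.0], "automobile": [1.0, 0.0, 0.0]}

combined = {
    word: combine(text_vectors[word], dt_vectors[word]) for word in text_vectors
}

text_only = cosine(text_vectors["car"], text_vectors["automobile"])
with_dt = cosine(combined["car"], combined["automobile"])
```

In this toy case the text-only similarity is 0, while the combined vectors have similarity 0.5, which shows how neighborhood information from the thesaurus network can add signal that the text-based vectors lack.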

Topics

Machine Learning > Core Methods > Representation Learning
Machine Learning > Core Methods > Embedding Learning
Natural Language Processing > Resources & Methods > Text Representation
Knowledge & Reasoning > Representation > Knowledge Representation
Knowledge & Reasoning > Reasoning > Graph Embeddings

Keywords

text representation, semantic similarity, network embedding, word similarity, word vector, distributional thesaurus, synonym detection
