Efficient Graph-based Word Sense Induction by Distributional Inclusion Vector Embeddings

Haw-Shiuan Chang; Amol Agrawal; Ananya Ganesh; Anirudha Desai; Vinayak Mathur; Alfred Hough; Andrew McCallum

NAACL 2018

Abstract

Word sense induction (WSI), which addresses polysemy by discovering multiple word senses without supervision, resolves ambiguities for downstream NLP tasks and also makes word representations more interpretable. This paper proposes an accurate and efficient graph-based method for WSI that builds a global non-negative vector embedding basis (whose basis indexes are interpretable, like topics) and clusters the basis indexes in the ego network of each polysemous word. By adopting distributional inclusion vector embeddings as our basis formation model, we avoid the expensive nearest neighbor search step that plagues other graph-based methods, without sacrificing the quality of the sense clusters. Experiments on three datasets show that our proposed method produces similar or better sense clusters and embeddings compared with previous state-of-the-art methods while being significantly more efficient.
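The idea of clustering basis indexes in a word's ego network can be illustrated with a toy sketch. This is not the authors' implementation: the hand-made embeddings, the function name `induce_senses`, the `top_k` and `threshold` parameters, the cosine similarity between basis columns, and the connected-components grouping are all assumptions chosen for illustration. The abstract does not specify the similarity measure or the clustering algorithm.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def induce_senses(embeddings, word, top_k=4, threshold=0.5):
    """Group the strongest basis indexes of `word` into sense clusters.

    Hypothetical helper: `embeddings` maps each word to a non-negative
    vector over a shared basis. Returns a sorted list of index clusters.
    """
    vec = embeddings[word]
    # Ego network nodes: basis indexes where the target word is strongest.
    nodes = sorted(range(len(vec)), key=lambda i: vec[i], reverse=True)[:top_k]
    nodes = [i for i in nodes if vec[i] > 0]

    # Describe each basis index by its weights over the other words
    # (assumption: two indexes are related if they fire on the same words).
    others = [w for w in embeddings if w != word]
    column = {i: [embeddings[w][i] for w in others] for i in nodes}

    # Union-find: merge basis indexes whose similarity passes the threshold.
    parent = {i: i for i in nodes}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for pos, a in enumerate(nodes):
        for b in nodes[pos + 1:]:
            if cosine(column[a], column[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for i in nodes:
        clusters.setdefault(find(i), []).append(i)
    return sorted(sorted(c) for c in clusters.values())


# Toy basis (assumed labels): 0 = finance, 1 = money, 2 = river, 3 = water.
embeddings = {
    "bank":   [0.9, 0.8, 0.7, 0.6],
    "loan":   [1.0, 0.9, 0.0, 0.0],
    "credit": [0.8, 1.0, 0.0, 0.0],
    "shore":  [0.0, 0.0, 1.0, 0.8],
    "stream": [0.0, 0.0, 0.9, 1.0],
}

print(induce_senses(embeddings, "bank"))  # [[0, 1], [2, 3]]
```

In this toy setting the only per-word work is a lookup of that word's own basis weights, so no nearest neighbor search over the vocabulary is needed to build the ego network, which is the efficiency argument the abstract makes.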

Topics

Machine Learning > Core Methods > Clustering
Machine Learning > Core Methods > Embedding Learning

Keywords

graph-based method, word sense induction, sense clustering, non-negative vector embedding, distributional inclusion
