Sphere Embedding: An Application to Part-of-Speech Induction

Yariv Maron; Michael Lamar; Elie Bienenstock

NeurIPS (NIPS) 2010

Abstract

Motivated by an application to unsupervised part-of-speech tagging, we present an algorithm for the Euclidean embedding of large sets of categorical data based on co-occurrence statistics. We use the CODE model of Globerson et al. but constrain the embedding to lie on a high-dimensional unit sphere. This constraint allows for efficient optimization, even in the case of large datasets and high embedding dimensionality. Using k-means clustering of the embedded data, our approach efficiently produces state-of-the-art results. We analyze the reasons why the sphere constraint is beneficial in this application, and conjecture that these reasons might apply quite generally to other large-scale tasks.
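The final step mentioned in the abstract, k-means clustering of points constrained to a unit sphere, can be illustrated with a minimal sketch. This is not the authors' algorithm: the CODE-based optimization that produces the embedding is omitted entirely, the toy vectors stand in for learned word embeddings, and the helper names `project_to_sphere` and `kmeans` are hypothetical.

```python
import numpy as np


def project_to_sphere(x):
    """Rescale each row to unit Euclidean length (points on the unit sphere)."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, 1e-12)  # guard against division by zero


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct data points (fancy indexing copies).
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every centroid.
        d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # leave an empty cluster's centroid unchanged
                centroids[j] = members.mean(axis=0)
    # Final assignment against the last centroid update.
    d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1), centroids


# Toy stand-in for learned embeddings: two well-separated groups of vectors.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=[5.0, 0.0, 0.0], scale=0.1, size=(20, 3))
group_b = rng.normal(loc=[0.0, 5.0, 0.0], scale=0.1, size=(20, 3))

emb = project_to_sphere(np.vstack([group_a, group_b]))
labels, centroids = kmeans(emb, k=2)
```

In a part-of-speech induction setting, each row of `emb` would correspond to a word type and each cluster label to an induced tag; how the paper maps clusters to tags is not stated in the abstract.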

Authors

Yariv Maron, Michael Lamar, Elie Bienenstock

Topics

Machine Learning > Core Methods > Clustering
Machine Learning > Core Methods > Embedding Learning
Natural Language Processing > Understanding > Part-of-Speech Tagging
Machine Learning > Learning Paradigms > Unsupervised Learning
Natural Language Processing > Applications > Part-of-Speech Tagging

Keywords

unsupervised learning, representation learning, k-means clustering, embedding learning, part-of-speech induction, part-of-speech tagging, sphere embedding, categorical data, co-occurrence statistics, Euclidean embedding
