Unsupervised Disambiguation of Syncretism in Inflected Lexicons

Ryan Cotterell; Christo Kirov; Sabrina J. Mielke; Jason Eisner

NAACL 2018

Abstract

Lexical ambiguity makes it difficult to compute useful statistics of a corpus. A given word form might represent any of several morphological feature bundles. One can, however, use unsupervised learning (as in EM) to fit a model that probabilistically disambiguates word forms. We present such an approach, which employs a neural network to smoothly model a prior distribution over feature bundles (even rare ones). Although this basic model does not consider a token’s context, that very property allows it to operate on a simple list of unigram type counts, partitioning each count among different analyses of that unigram. We discuss evaluation metrics for this novel task and report results on 5 languages.
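
The count-partitioning idea can be illustrated with a minimal sketch. It is not the paper's model: the neural prior is swapped for a plain categorical prior over feature bundles, lemmas are ignored, and the word forms, tags, and counts below are invented for illustration. The function name `em_partition` is hypothetical.

```python
from collections import defaultdict


def em_partition(counts, analyses, iterations=50):
    """Split each unigram type count among its candidate feature bundles.

    Assumption: a categorical prior over bundles stands in for the
    neural prior described in the abstract.
    """
    bundles = sorted({b for cands in analyses.values() for b in cands})
    # Start from a uniform prior over all bundles seen in the lexicon.
    prior = {b: 1.0 / len(bundles) for b in bundles}

    for _ in range(iterations):
        # E-step: distribute each form's count over its analyses in
        # proportion to the current prior, renormalized per form.
        expected = defaultdict(float)
        for form, count in counts.items():
            z = sum(prior[b] for b in analyses[form])
            for b in analyses[form]:
                expected[b] += count * prior[b] / z
        # M-step: re-estimate the prior from the fractional counts.
        total = sum(expected.values())
        prior = {b: expected[b] / total for b in bundles}

    # Final partition of each count under the fitted prior.
    split = {}
    for form, count in counts.items():
        z = sum(prior[b] for b in analyses[form])
        split[form] = {b: count * prior[b] / z for b in analyses[form]}
    return prior, split


# Toy data (invented): "walked" is syncretic, "took" and "taken" are not.
counts = {"walked": 100, "took": 30, "taken": 10}
analyses = {
    "walked": ["V;PST", "V;PTCP;PST"],
    "took": ["V;PST"],
    "taken": ["V;PTCP;PST"],
}
prior, split = em_partition(counts, analyses)
print(split["walked"])
```

On this toy input the unambiguous forms show a 3:1 ratio of past tense to past participle, so the 100 tokens of "walked" end up split into roughly 75 and 25. A categorical prior can only learn from bundles that appear unambiguously somewhere; sharing information across rare bundles is what the abstract's smooth neural prior is meant to address.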

Topics

Machine Learning > Learning Types > Unsupervised Learning; Deep Learning > Architectures > Neural Networks

Keywords

unsupervised learning; expectation maximization; morphological analysis; lexical disambiguation; neural network
