Interpolated Spectral NGram Language Models

Ariadna Quattoni; Xavier Carreras

ACL 2019


Abstract

Spectral models for learning weighted non-deterministic automata have nice theoretical and algorithmic properties. Despite this, it has been challenging to obtain competitive results in language modeling tasks, for two main reasons. First, in order to capture long-range dependencies in the data, the method must use statistics from long substrings, which results in very large matrices that are difficult to decompose. Second, the loss function behind spectral learning, based on moment matching, differs from the probabilistic metrics used to evaluate language models. In this work we employ a technique for scaling up spectral learning, and use interpolated predictions that are optimized to minimize perplexity. Our experiments in character-based language modeling show that our method matches the performance of state-of-the-art ngram models, while being very fast to train.
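To make the idea of interpolated predictions tuned for perplexity concrete, here is a minimal sketch. It is not the authors' method: it assumes just two hypothetical models whose per-character probabilities on held-out text are already available, combines them with a single mixing weight, and picks that weight by grid search. The names `perplexity`, `interpolate`, `best_weight`, and the example probabilities are illustrative assumptions. Lower perplexity is better.

```python
import math

def perplexity(probs):
    """Perplexity of a sequence, given the probability assigned to each symbol."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))

def interpolate(p_a, p_b, lam):
    """Linear interpolation of two per-symbol probability lists."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(p_a, p_b)]

def best_weight(p_a, p_b, steps=100):
    """Grid search for the mixing weight with the lowest held-out perplexity."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda lam: perplexity(interpolate(p_a, p_b, lam)))

# Hypothetical held-out probabilities from two models (illustrative values only).
p_short = [0.20, 0.05, 0.30, 0.10]
p_long = [0.10, 0.40, 0.05, 0.25]

lam = best_weight(p_short, p_long)
mixed_ppl = perplexity(interpolate(p_short, p_long, lam))
print(f"weight={lam:.2f} perplexity={mixed_ppl:.3f}")
```

Because the grid includes the weights 0 and 1, the interpolated perplexity found this way is never worse than that of either model alone on the same held-out data.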



Topics

Machine Learning > Core Methods > Representation Learning
Machine Learning > Optimization & Theory > Optimization
Natural Language Processing > Generation > Language Modeling
Mathematics & Optimization > Optimization > Stochastic Methods
Machine Learning > Optimization & Theory > Stochastic Methods
Deep Learning > Learning Types > Representation Learning
Deep Learning > Models > Language Models

Keywords

spectral learning, language model, weighted automaton, ngram model, n-gram model, character modeling, character-based modeling, interpolated prediction

