NeurIPS (NIPS) 2008

A Scalable Hierarchical Distributed Language Model

Abstract

Neural probabilistic language models (NPLMs) have been shown to be competitive with, and occasionally superior to, the widely used n-gram language models. The main drawback of NPLMs is their extremely long training and testing times. Morin and Bengio proposed a hierarchical language model built around a binary tree of words, which was two orders of magnitude faster than the non-hierarchical language model it was based on. However, it performed considerably worse than its non-hierarchical counterpart, despite using a word tree created with expert knowledge. We introduce a fast hierarchical language model along with a simple feature-based algorithm for automatically constructing word trees from data. We then show that the resulting models can outperform non-hierarchical models and achieve state-of-the-art performance.
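
To illustrate why a binary tree of words can speed up prediction, the sketch below uses the standard tree factorization: the probability of a word is the product of the left/right decisions along its path from the root, so scoring one word costs on the order of the tree depth rather than the vocabulary size. The abstract does not spell out the formulation, so everything here is an assumption made for illustration: the four-word tree in `PATHS`, the random node and context vectors, and the helper names are hypothetical and are not the authors' implementation.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Hypothetical toy tree over four words. Each word is encoded as a path of
# (internal_node_id, bit) pairs from the root; bit 1 means "go left" and
# bit 0 means "go right".
PATHS = {
    "the": [(0, 1), (1, 1)],
    "cat": [(0, 1), (1, 0)],
    "sat": [(0, 0), (2, 1)],
    "mat": [(0, 0), (2, 0)],
}


def word_probability(word, context_vec, node_vecs):
    """Multiply the binary decision probabilities along the word's path."""
    p = 1.0
    for node_id, bit in PATHS[word]:
        p_left = sigmoid(dot(node_vecs[node_id], context_vec))
        p *= p_left if bit == 1 else 1.0 - p_left
    return p


# Random vectors stand in for learned parameters (assumption: dimension 3).
rng = random.Random(0)
DIM = 3
node_vecs = {n: [rng.uniform(-1.0, 1.0) for _ in range(DIM)] for n in (0, 1, 2)}
context_vec = [rng.uniform(-1.0, 1.0) for _ in range(DIM)]

probs = {w: word_probability(w, context_vec, node_vecs) for w in PATHS}
total = sum(probs.values())
print(probs, total)
```

Because each internal node splits its probability mass between its two children, the leaf probabilities sum to one without an explicit normalization over the whole vocabulary. Each word here needs two sigmoid evaluations; in a balanced tree over N words the count grows roughly as log2(N).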
