EACL 2017
An experimental analysis of Noise-Contrastive Estimation: the noise distribution matters
Abstract
Noise Contrastive Estimation (NCE) is a learning procedure that is regularly used to train neural language models, since it avoids the computational bottleneck caused by the output softmax. In this paper, we attempt to explain some of the weaknesses of this objective function, and to draw directions for further developments. Experiments on a small task show the issues raised by a unigram noise distribution, and that a context-dependent noise distribution, such as the bigram distribution, can solve these issues and provide stable and data-efficient learning.
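As a rough illustration of the objective the abstract refers to, the sketch below computes the standard NCE loss for one target word and k noise samples, and builds the two kinds of noise distribution mentioned (unigram and bigram) from raw counts. It is not the authors' implementation: the function names (`nce_loss`, `unigram_noise`, `bigram_noise`), the count-based estimates, and the toy token list are assumptions made here for illustration only.

```python
import math
from collections import Counter, defaultdict


def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)).
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def nce_loss(score_target, log_q_target, noise_scores, noise_log_qs, k):
    """NCE loss for one (context, target) pair with k noise samples.

    score_* are unnormalized model log-scores (no softmax is computed);
    log_q_* are log-probabilities under the noise distribution.
    """
    log_k = math.log(k)
    # The observed word should be classified as "data".
    loss = -log_sigmoid(score_target - log_k - log_q_target)
    for s, log_q in zip(noise_scores, noise_log_qs):
        # Each noise word should be classified as "noise":
        # log(1 - sigmoid(x)) == log_sigmoid(-x).
        loss -= log_sigmoid(-(s - log_k - log_q))
    return loss


def unigram_noise(tokens):
    # Context-independent noise: relative word frequencies.
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def bigram_noise(tokens):
    # Context-dependent noise: q(word | previous word) from bigram counts.
    pair_counts = defaultdict(Counter)
    for prev, cur in zip(tokens, tokens[1:]):
        pair_counts[prev][cur] += 1
    return {
        prev: {w: c / sum(cnt.values()) for w, c in cnt.items()}
        for prev, cnt in pair_counts.items()
    }


if __name__ == "__main__":
    toy = ["a", "b", "a", "c"]  # hypothetical toy corpus
    print(unigram_noise(toy))
    print(bigram_noise(toy))
```

In this sketch the only difference between the two settings is where `log_q_target` and `noise_log_qs` come from: a single table for the unigram case, or a table indexed by the previous word for the bigram case.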
Topics
Deep Learning > Architectures > Neural Networks
Natural Language Processing > Generation > Language Modeling
Machine Learning > Bayesian & Probabilistic > Probabilistic Modeling
Machine Learning > Optimization & Theory > Stochastic Methods
Machine Learning > Learning Types > Deep Learning
Deep Learning > Optimization & Theory > Optimization