Large Margin Neural Language Model

Jiaji Huang; Yi Li; Wei Ping; Liang Huang

2018 EMNLP EMNLP 2018

Large Margin Neural Language Model

Abstract

AbstractWe propose a large margin criterion for training neural language models. Conventionally, neural language models are trained by minimizing perplexity (PPL) on grammatical sentences. However, we demonstrate that PPL may not be the best metric to optimize in some tasks, and further propose a large margin formulation. The proposed method aims to enlarge the margin between the “good” and “bad” sentences in a task-specific sense. It is trained end-to-end and can be widely applied to tasks that involve re-scoring of generated text. Compared with minimum-PPL training, our method gains up to 1.1 WER reduction for speech recognition and 1.0 BLEU increase for machine translation.

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning and Natural Language Processing

🧭 Keyword Pioneer — perplexity optimization

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Jiaji Huang , Yi Li , Wei Ping , Liang Huang

Topics

Machine Learning > Core Methods > Regression Machine Learning > Optimization & Theory > Loss Functions Natural Language Processing > Generation > Language Modeling Deep Learning > Models > Neural Networks Deep Learning > Models > Language Modeling

Keywords

large margin machine translation speech recognition large margin training margin maximization neural language model perplexity optimization

Download PDF

Related papers

Speeding Up Neural Machine Translation Decoding by Cube Pruning 2018

Limitations in learning an interpreted language with recurrent models 2018

Results of the sixth edition of the BioASQ Challenge 2018

Neural Segmental Hypergraphs for Overlapping Mention Recognition 2018

Hybrid Neural Attention for Agreement/Disagreement Inference in Online Debates 2018