Language Modeling with Sparse Product of Sememe Experts

Yihong Gu; Jun Yan; Hao Zhu; Zhiyuan Liu; Ruobing Xie; Maosong Sun; Fen Lin; Leyu Lin

2018 EMNLP EMNLP 2018

Language Modeling with Sparse Product of Sememe Experts

Abstract

AbstractMost language modeling methods rely on large-scale data to statistically learn the sequential patterns of words. In this paper, we argue that words are atomic language units but not necessarily atomic semantic units. Inspired by HowNet, we use sememes, the minimum semantic units in human languages, to represent the implicit semantics behind words for language modeling, named Sememe-Driven Language Model (SDLM). More specifically, to predict the next word, SDLM first estimates the sememe distribution given textual context. Afterwards, it regards each sememe as a distinct semantic expert, and these experts jointly identify the most probable senses and the corresponding word. In this way, SDLM enables language models to work beyond word-level manipulation to fine-grained sememe-level semantics, and offers us more powerful tools to fine-tune language models and improve the interpretability as well as the robustness of language models. Experiments on language modeling and the downstream application of headline generation demonstrate the significant effectiveness of SDLM.

🌉 Interdisciplinary Bridge — Deep Learning and Natural Language Processing

🧭 Keyword Pioneer — sparse expert

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Yihong Gu , Jun Yan , Hao Zhu , Zhiyuan Liu , Ruobing Xie , Maosong Sun , Fen Lin , Leyu Lin

Topics

Natural Language Processing > Generation > Language Modeling Natural Language Processing > Generation > Text Generation Natural Language Processing > Resources & Methods > Language Modeling Deep Learning > Models > Language Modeling

Keywords

text generation language modeling headline generation semantic representation word prediction neural network sparse expert

Download PDF

Related papers

Speeding Up Neural Machine Translation Decoding by Cube Pruning 2018

Limitations in learning an interpreted language with recurrent models 2018

Results of the sixth edition of the BioASQ Challenge 2018

Neural Segmental Hypergraphs for Overlapping Mention Recognition 2018

Hybrid Neural Attention for Agreement/Disagreement Inference in Online Debates 2018