Improving Distributed Word Representation and Topic Model by Word-Topic Mixture Model

Xianghua Fu; Ting Wang; Jing Li; Chong Yu; Wangwang Liu

ACML 2016

Abstract

We propose a Word-Topic Mixture (WTM) model to improve word representation and topic modeling simultaneously. First, it introduces initial external word embeddings into the Topical Word Embeddings (TWE) model, which is based on Latent Dirichlet Allocation (LDA), to learn word embeddings and topic vectors. The results learned from TWE are then integrated into LDA by defining the topic-word probability distribution in terms of topic vectors and word embeddings, following the idea of the latent feature model with LDA (LFLDA), while minimizing the KL divergence between the new topic-word distribution and the original one. Experimental results show that the WTM model performs better on word representation and topic detection than some state-of-the-art models.
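
The two quantities named in the abstract can be illustrated with a minimal sketch. It assumes the embedding-based topic-word distribution is a softmax over dot products between a topic vector and the word embeddings (the form used in LFLDA-style latent feature models), and it compares that distribution with an LDA topic-word distribution via KL divergence. The function names, the toy dimensions, the random data, and the direction of the KL term are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax over a 1-D array of scores."""
    shifted = scores - scores.max()
    exp = np.exp(shifted)
    return exp / exp.sum()

def latent_feature_topic_word(topic_vec, word_embs):
    """Assumed form: P(w | t) proportional to exp(topic_vec . word_emb_w)."""
    return softmax(word_embs @ topic_vec)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions; eps guards against log(0)."""
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    return float(np.sum(p * np.log(p / q)))

# Toy data (hypothetical): 6 vocabulary words, 4-dimensional embeddings.
rng = np.random.default_rng(0)
vocab_size, dim = 6, 4
word_embs = rng.normal(size=(vocab_size, dim))   # stand-in for learned word embeddings
topic_vec = rng.normal(size=dim)                 # stand-in for a learned topic vector
lda_phi = rng.dirichlet(np.ones(vocab_size))     # stand-in for an LDA topic-word row

embed_phi = latent_feature_topic_word(topic_vec, word_embs)
kl = kl_divergence(embed_phi, lda_phi)           # direction chosen for illustration only
print("embedding-based topic-word distribution:", embed_phi)
print("KL(embedding-based || LDA):", kl)
```

In a full model this KL term would be one part of an objective minimized over all topics; the sketch only evaluates it for a single topic.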

Topics

Machine Learning > Core Methods > Embedding Learning
Natural Language Processing > Resources & Methods > Text Representation
Machine Learning > Bayesian & Probabilistic > Probabilistic Modeling

Keywords

latent Dirichlet allocation, probabilistic graphical model, topic model, word embedding, word representation, topical word embedding

Related papers

Echo State Hoeffding Tree Learning 2016

Cost Sensitive Online Multiple Kernel Classification 2016

Random Fourier Features For Operator-Valued Kernels 2016

Secure Approximation Guarantee for Cryptographically Private Empirical Risk Minimization 2016

Hierarchical Probabilistic Matrix Factorization with Network Topology for Multi-relational Social Network 2016