Supervised Topic Models

Jon D. McAuliffe; David M. Blei

NIPS (NeurIPS) 2007

Abstract

We introduce supervised latent Dirichlet allocation (sLDA), a statistical model of labelled documents. The model accommodates a variety of response types. We derive a maximum-likelihood procedure for parameter estimation, which relies on variational approximations to handle intractable posterior expectations. Prediction problems motivate this research: we use the fitted model to predict response values for new documents. We test sLDA on two real-world problems: movie ratings predicted from reviews, and web page popularity predicted from text descriptions. We illustrate the benefits of sLDA versus modern regularized regression, as well as versus an unsupervised LDA analysis followed by a separate regression.
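The abstract does not spell out the model, so the following is only a minimal sketch of the generative story usually associated with sLDA, restricted to a Gaussian response (the abstract notes that other response types are accommodated). Each document draws topic proportions, each word draws a topic and then a term, and the response is drawn around a linear function of the document's empirical topic frequencies. The function name `sample_slda_document` and the parameter names `alpha`, `beta`, `eta`, and `sigma` are illustrative assumptions, not identifiers from the paper, and the sketch covers sampling only, not the variational estimation procedure.

```python
import random


def sample_slda_document(alpha, beta, eta, sigma, n_words, rng):
    """Sample one (words, z_bar, y) triple under a Gaussian-response sLDA sketch.

    alpha: Dirichlet parameters, one per topic (assumed name).
    beta:  per-topic word distributions, beta[k][v] (assumed name).
    eta:   regression coefficients, one per topic (assumed name).
    sigma: standard deviation of the response noise (assumed name).
    """
    n_topics = len(alpha)
    vocab = range(len(beta[0]))

    # theta ~ Dirichlet(alpha), drawn as normalized Gamma variates.
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    theta = [g / total for g in gammas]

    # z_n ~ Categorical(theta): one topic assignment per word position.
    z = rng.choices(range(n_topics), weights=theta, k=n_words)

    # w_n ~ Categorical(beta[z_n]): one term per word position.
    words = [rng.choices(vocab, weights=beta[k])[0] for k in z]

    # z_bar: empirical topic frequencies of the document.
    z_bar = [z.count(k) / n_words for k in range(n_topics)]

    # y ~ Normal(eta . z_bar, sigma^2): the document's response.
    mean = sum(e * zb for e, zb in zip(eta, z_bar))
    y = rng.gauss(mean, sigma)
    return words, z_bar, y


rng = random.Random(0)
alpha = [0.5, 0.5, 0.5]
beta = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.4, 0.4],
]
eta = [-1.0, 0.0, 1.0]
words, z_bar, y = sample_slda_document(alpha, beta, eta, sigma=0.1, n_words=50, rng=rng)
```

Because the response depends on the same topic assignments that generate the words, fitting such a model ties the topics to the prediction target, which is the contrast with running unsupervised LDA first and a separate regression afterwards.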

Topics

Machine Learning > Core Methods > Classification
Machine Learning > Core Methods > Regression
Machine Learning > Core Methods > Representation Learning
Natural Language Processing > Generation > Language Modeling
Natural Language Processing > Applications > Text Classification
Natural Language Processing > Resources & Methods > Text Representation
Machine Learning > Bayesian & Probabilistic > Probabilistic Modeling
Machine Learning > Learning Types > Supervised Learning
Artificial Intelligence > Core AI > Large Language Models
Machine Learning > Learning Types > Fine-Tuning
Machine Learning > Core Methods > Topic Modeling

Keywords

variational inference, text classification, latent Dirichlet allocation, topic modeling, supervised learning, document classification, text regression, knowledge retrieval, knowledge injection, supervised topic model, large language model
