Spectral Methods for Supervised Topic Models

Yining Wang; Jun Zhu

NeurIPS (NIPS) 2014

Abstract

Supervised topic models simultaneously model the latent topic structure of large collections of documents and a response variable associated with each document. Existing inference methods are based on either variational approximation or Monte Carlo sampling. This paper presents a novel spectral decomposition algorithm to recover the parameters of supervised latent Dirichlet allocation (sLDA) models. The Spectral-sLDA algorithm is provably correct and computationally efficient. We prove a sample complexity bound and subsequently derive a sufficient condition for the identifiability of sLDA. Thorough experiments on a diverse range of synthetic and real-world datasets verify the theory and demonstrate the practical effectiveness of the algorithm.
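To make concrete what "modeling topics and a response variable simultaneously" means, here is a minimal sketch of a generative process for sLDA with a real-valued response. It is an illustration under assumptions, not the paper's algorithm: the function name `sample_slda_corpus`, all parameter names, and the choice of a linear-Gaussian response on the empirical topic frequencies are hypothetical and are not taken from the abstract. The spectral recovery procedure itself is not shown.

```python
import numpy as np


def sample_slda_corpus(n_docs, doc_len, alpha, topics, eta, sigma, seed=0):
    """Draw documents and responses from an assumed sLDA-style model.

    alpha  : (k,) Dirichlet prior over topic proportions
    topics : (k, vocab) matrix, each row a word distribution for one topic
    eta    : (k,) linear regression weights on topic frequencies (assumption)
    sigma  : standard deviation of the Gaussian response noise (assumption)
    """
    rng = np.random.default_rng(seed)
    k, vocab = topics.shape
    docs = np.empty((n_docs, doc_len), dtype=np.int64)
    responses = np.empty(n_docs)
    for d in range(n_docs):
        # Per-document topic proportions.
        theta = rng.dirichlet(alpha)
        # One topic assignment per word position.
        z = rng.choice(k, size=doc_len, p=theta)
        # Each word is drawn from the distribution of its assigned topic.
        for i, zi in enumerate(z):
            docs[d, i] = rng.choice(vocab, p=topics[zi])
        # The response depends linearly on empirical topic frequencies.
        z_bar = np.bincount(z, minlength=k) / doc_len
        responses[d] = eta @ z_bar + sigma * rng.normal()
    return docs, responses
```

A parameter-recovery method such as the one the abstract describes would take only `docs` and `responses` as input and aim to estimate `alpha`, `topics`, and `eta`; a synthetic corpus like this gives ground truth to compare against.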

Topics

Machine Learning > Core Methods > Representation Learning
Machine Learning > Core Methods > Topic Modeling
Machine Learning > Learning Types > Supervised Learning
Machine Learning > Optimization & Theory > Statistical Learning
Machine Learning > Optimization & Theory > Theory

Keywords

latent Dirichlet allocation, parameter estimation, tensor decomposition, spectral decomposition, topic model, spectral method, supervised topic model
