Sparse Additive Text Models with Low Rank Background

Lei Shi

NIPS 2013 (NeurIPS 2013)

Abstract

The sparse additive model for text involves sum-of-exp computations, which are costly at large scale. Moreover, its assumption of an equal background across all classes/topics may be too strong. This paper proposes an extension, the sparse additive model with low rank background (SAM-LRB), together with a simple yet efficient estimation procedure. Specifically, by employing a double majorization bound, we approximate the log-likelihood by a quadratic lower bound in which the sum-of-exp terms are absent. The low rank and sparsity constraints are then expressed simply as nuclear norm and $\ell_1$-norm regularizers. Interestingly, we find that the resulting optimization task can be transformed into the same form as that of Robust PCA. Consequently, the parameters of supervised SAM-LRB can be learned efficiently using an existing Robust PCA algorithm based on accelerated proximal gradient. Beyond the supervised case, we extend SAM-LRB to unsupervised and multifaceted scenarios. Experiments on real-world data demonstrate the effectiveness and efficiency of SAM-LRB, showing state-of-the-art performance.
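To make the Robust PCA connection concrete, the following is a minimal sketch of the generic relaxed Robust PCA problem, $\min_{L,S} \tfrac{1}{2}\|D - L - S\|_F^2 + \mu(\|L\|_* + \lambda\|S\|_1)$, solved with plain (non-accelerated) proximal gradient steps. It is not the SAM-LRB estimator itself: the data matrix `D`, the function names, and the parameters `mu`, `lam` and `n_iter` are assumptions made for illustration, and the reading of `L` as a low rank background and `S` as sparse deviations is only an analogy to the abstract. The nuclear norm is handled by singular value thresholding and the $\ell_1$-norm by entrywise soft thresholding.

```python
import numpy as np


def soft_threshold(X, tau):
    """Proximal operator of tau * ||X||_1 (entrywise shrinkage)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def svd_threshold(X, tau):
    """Proximal operator of tau * ||X||_* (shrink the singular values)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def objective(D, L, S, mu, lam):
    """0.5 * ||D - L - S||_F^2 + mu * (||L||_* + lam * ||S||_1)."""
    fit = 0.5 * np.linalg.norm(D - L - S, "fro") ** 2
    nuclear = np.linalg.svd(L, compute_uv=False).sum()
    return fit + mu * (nuclear + lam * np.abs(S).sum())


def rpca_proximal_gradient(D, mu, lam, n_iter=200):
    """Plain proximal gradient on (L, S); hypothetical helper, not the paper's code."""
    D = np.asarray(D, dtype=float)
    L = np.zeros_like(D)
    S = np.zeros_like(D)
    # The smooth term's gradient in (L, S) has Lipschitz constant 2.
    step = 0.5
    for _ in range(n_iter):
        R = L + S - D  # gradient of the smooth term w.r.t. both L and S
        L_next = svd_threshold(L - step * R, step * mu)
        S_next = soft_threshold(S - step * R, step * mu * lam)
        L, S = L_next, S_next
    return L, S


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    low_rank = np.outer(rng.normal(size=20), rng.normal(size=15))
    sparse = np.zeros((20, 15))
    sparse[rng.integers(0, 20, 10), rng.integers(0, 15, 10)] = 5.0
    D = low_rank + sparse
    L, S = rpca_proximal_gradient(D, mu=0.1, lam=1.0 / np.sqrt(20))
    print("rank of L:", np.linalg.matrix_rank(L, tol=1e-6))
    print("nonzeros in S:", int(np.count_nonzero(S)))
```

An accelerated variant would add a momentum (extrapolation) step between iterations; the two proximal operators stay the same.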

Topics

Machine Learning > Core Methods > Representation Learning
Natural Language Processing > Generation > Language Modeling
Natural Language Processing > Resources & Methods > Text Representation
Machine Learning > Core Methods > Dimensionality Reduction
Machine Learning > Learning Types > Supervised Learning
Mathematics & Optimization > Optimization > Sparse Optimization
Machine Learning > Learning Types > Sparse Learning
Natural Language Processing > Applications > Text Processing
Machine Learning > Core Methods > Sparse Coding

Keywords

matrix factorization, topic modeling, low-rank approximation, sparse optimization, robust principal component analysis, nuclear norm regularization, text modeling, nuclear norm, sparse additive model, accelerated proximal gradient

Related papers

Latent Structured Active Learning 2013

On Flat versus Hierarchical Classification in Large-Scale Taxonomies 2013

Generalized Method-of-Moments for Rank Aggregation 2013

Third-Order Edge Statistics: Contour Continuation, Curvature, and Cortical Connections 2013

Accelerated Mini-Batch Stochastic Dual Coordinate Ascent 2013