ACL 2020
Topic Balancing with Additive Regularization of Topic Models
Abstract
This article proposes a new approach to building topic models on unbalanced collections, based on existing methods and our experiments with them. Real-world data collections contain topics in varying proportions, and documents belonging to a relatively small topic are often scattered across the larger topics instead of being grouped into a single topic. To address this issue, we design a new regularizer for the Theta and Phi matrices of the probabilistic Latent Semantic Analysis (pLSA) model. We verify that this regularizer improves the quality of topic models trained on unbalanced collections, and we support it conceptually with our experiments.
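For readers unfamiliar with the setting, the sketch below illustrates where a regularizer on Phi (word-topic probabilities) and Theta (topic-document probabilities) enters pLSA training under the standard additive regularization (ARTM) M-step. It is a minimal illustration, not the authors' implementation: the function name `artm_em`, the `reg_grad` hook, and the toy count matrix are hypothetical, and the topic-balancing regularizer itself is not specified in the abstract, so it is not reproduced here.

```python
import numpy as np

def artm_em(n_dw, num_topics, reg_grad=None, num_iters=50, seed=0):
    """EM for pLSA with an optional additive regularizer (hypothetical helper).

    n_dw:     (D, W) array of document-word counts.
    reg_grad: optional callable (phi, theta) -> (dR/dphi, dR/dtheta),
              with shapes (W, T) and (T, D). None gives plain pLSA.
    Returns phi (W, T) and theta (T, D), each with columns summing to 1.
    """
    rng = np.random.default_rng(seed)
    D, W = n_dw.shape
    phi = rng.random((W, num_topics))
    phi /= phi.sum(axis=0, keepdims=True)
    theta = rng.random((num_topics, D))
    theta /= theta.sum(axis=0, keepdims=True)

    for _ in range(num_iters):
        # E-step folded into the counters: p(t|d,w) is proportional to phi_wt * theta_td.
        p_dw = (phi @ theta).T                      # (D, W) model probabilities p(w|d)
        ratio = n_dw / np.maximum(p_dw, 1e-12)      # n_dw / p(w|d)
        n_wt = phi * (ratio.T @ theta.T)            # (W, T) expected word-topic counts
        n_td = theta * (phi.T @ ratio.T)            # (T, D) expected topic-document counts

        # ARTM M-step: add phi * dR/dphi and theta * dR/dtheta to the counters.
        if reg_grad is not None:
            g_phi, g_theta = reg_grad(phi, theta)
            n_wt = n_wt + phi * g_phi
            n_td = n_td + theta * g_theta

        # Truncate negatives, then renormalize each column.
        n_wt = np.maximum(n_wt, 0.0)
        n_td = np.maximum(n_td, 0.0)
        phi = n_wt / np.maximum(n_wt.sum(axis=0, keepdims=True), 1e-12)
        theta = n_td / np.maximum(n_td.sum(axis=0, keepdims=True), 1e-12)
    return phi, theta

# Toy unbalanced collection (made-up counts): 8 documents on one theme, 2 on another.
n_dw = np.array([[5, 3, 0, 0]] * 8 + [[0, 0, 4, 2]] * 2, dtype=float)
phi, theta = artm_em(n_dw, num_topics=2)
```

Passing a `reg_grad` callable is where a balancing regularizer would plug in under this sketch; with `reg_grad=None` the loop reduces to ordinary pLSA.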
Topics
Artificial Intelligence > Bayesian & Probabilistic > Probabilistic Modeling
Machine Learning > Core Methods > Clustering
Machine Learning > Optimization & Theory > Optimization
Machine Learning > Core Methods > Topic Modeling
Natural Language Processing > Applications > Topic Modeling
Machine Learning > Optimization & Theory > Regularization