COLING 2024

FoTo: Targeted Visual Topic Modeling for Focused Analysis of Short Texts

Abstract

Given a corpus of documents, focused analysis aims to find topics relevant to aspects that a user is interested in. The aspects are often expressed by a set of keywords provided by the user. Short texts such as microblogs and tweets pose several challenges to this task because the sparsity of word co-occurrences may hinder the extraction of meaningful and relevant topics. Moreover, most existing topic models perform a full-corpus analysis that treats all topics equally, so the learned topics may be off target. In this paper, we propose a novel targeted topic model for semantic short-text embedding that aims to learn all topics and low-dimensional visual representations of documents while preserving the topics relevant for focused analysis of short texts. To preserve the relevant topics in the visualization space, we propose jointly modeling topics and the pairwise document ranking based on document-keyword distances in the visualization space. Extensive experiments on several real-world datasets demonstrate the effectiveness of our proposed model in terms of targeted topic modeling and visualization.


Authors