Lessons from the Bible on Modern Topics: Low-Resource Multilingual Topic Model Evaluation

Shudong Hao; Jordan Boyd-Graber; Michael J. Paul

NAACL 2018

Abstract

Multilingual topic models enable document analysis across languages through coherent multilingual summaries of the data. However, there is no standard and effective metric to evaluate the quality of multilingual topics. We introduce a new intrinsic evaluation of multilingual topic models that correlates well with human judgments of multilingual topic coherence as well as performance in downstream applications. Importantly, we also study evaluation for low-resource languages. Because standard metrics fail to accurately measure topic quality when robust external resources are unavailable, we propose an adaptation model that improves the accuracy and reliability of these metrics in low-resource settings.
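The abstract does not spell out the metric, so the following is only an illustrative sketch of what an intrinsic, co-occurrence-based coherence score for a multilingual topic might look like. It uses normalized pointwise mutual information (NPMI), a widely used monolingual coherence measure, and applies it across languages by counting co-occurrence over aligned document pairs. The function name `cross_lingual_npmi`, the `aligned_docs` input format, and the handling of edge cases are assumptions made for this example, not the authors' definitions.

```python
import math
from itertools import product


def cross_lingual_npmi(top_words_a, top_words_b, aligned_docs):
    """Average NPMI over word pairs, one word from each language.

    aligned_docs is a hypothetical reference corpus: a list of
    (words_in_language_a, words_in_language_b) pairs, each a set.
    """
    n = len(aligned_docs)
    scores = []
    for wa, wb in product(top_words_a, top_words_b):
        # Document-pair frequencies for each word and for the pair.
        count_a = sum(1 for da, _ in aligned_docs if wa in da)
        count_b = sum(1 for _, db in aligned_docs if wb in db)
        count_ab = sum(1 for da, db in aligned_docs if wa in da and wb in db)
        if count_ab == 0:
            # Assumed convention: words that never co-occur get the minimum.
            scores.append(-1.0)
            continue
        p_a, p_b, p_ab = count_a / n, count_b / n, count_ab / n
        if p_ab == 1.0:
            # Both words appear in every pair; avoid dividing by log(1) = 0.
            scores.append(1.0)
            continue
        pmi = math.log(p_ab / (p_a * p_b))
        scores.append(pmi / -math.log(p_ab))
    # An empty topic has no word pairs; return 0.0 by assumption.
    return sum(scores) / len(scores) if scores else 0.0


# Toy aligned corpus (English / Spanish), invented for illustration.
docs = [
    ({"dog"}, {"perro"}),
    ({"dog"}, {"perro"}),
    ({"cat"}, {"gato"}),
    ({"cat"}, {"gato"}),
]
coherent = cross_lingual_npmi(["dog"], ["perro"], docs)
incoherent = cross_lingual_npmi(["dog"], ["gato"], docs)
mixed = cross_lingual_npmi(["dog", "cat"], ["perro", "gato"], docs)
```

With only a handful of aligned pairs, as in this toy corpus, the probability estimates are extremely coarse. This is one way to see why such scores can become unreliable when large external reference corpora are unavailable, which is the low-resource problem the abstract raises.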

Topics

Machine Learning > Core Methods > Clustering; Machine Learning > Application Areas > Domain Adaptation; Natural Language Processing > Resources & Methods > Multilingual NLP; Machine Learning > Learning Types > Evaluation

Keywords

topic coherence; low-resource language; topic model; intrinsic evaluation; multilingual topic model; adaptation model; multilingual topic
