To make someone do something: mining alert-style directives in Bulgarian social media for low-resource language modelling

Ruslana Margova; Stanislav Penkov

EACL 2026

Abstract

The work demonstrates how meaningful rhetorical signals can be isolated from a social media dataset even without pre-labelled data or predefined lexicons. By combining unsupervised mining with linguistic theory and interpretable machine learning, the research offers a scalable approach to understanding how language can shape political perception and behaviour in digital spaces. The study focuses on Bulgarian, a morphologically rich, relatively low-resource language, and produces reusable resources (alert constructions, post-level features, and trained classifiers) that are explicitly designed to support low-resource language modelling, including the training and evaluation of neural language models and LLMs for tasks such as content moderation and propaganda-alert detection. The finding that rhetorical salience, not just topical content, drives engagement has implications beyond Bulgarian: it suggests that how something is said may matter as much as what is said in determining a message’s viral potential and persuasive impact.

Topics

Machine Learning > Learning Types > Unsupervised Learning
Natural Language Processing > Applications > Text Classification
Natural Language Processing > Resources & Methods > Multilingual NLP

Keywords

unsupervised learning, text classification, social media analysis, interpretable machine learning, low-resource language, language modelling
