2025 NAACL NAACL 2025

Seeds of Discourse: A Multilingual Corpus of Direct Quotations from African Media on Agricultural Biotechnologies

Abstract

AbstractDirect quotations play a crucial role in journalism by substantiating claims and enhancing persuasive communication. This makes news articles a rich resource for opinion mining, providing valuable insights into the topics they cover. This paper presents the first multilingual corpora (English and French) featuring both manually annotated (1,657) and automatically extracted (102,483) direct quotations related to agricultural biotechnologies from a curated list of Africa-based news sources. In addition, we provide 665 instances annotated for Aspect-Based Sentiment Analysis, enabling a fine-grained examination of sentiment toward key aspects of agricultural biotechnologies. These corpora are freely available to the research community for future work on media discourse surrounding agricultural biotechnologies.

🌉 Interdisciplinary Bridge — Interdisciplinary and Natural Language Processing
🧭 Keyword Pioneer — direct quotation
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio