French GossipPrompts: Dataset For Prevention of Generating French Gossip Stories By LLMs

Msvpj Sathvik; Abhilash Dowpati; Revanth Narra

2024 EACL EACL 2024

French GossipPrompts: Dataset For Prevention of Generating French Gossip Stories By LLMs

Abstract

AbstractThe realm of Large Language Models (LLMs) is undergoing a continuous and dynamic transformation. These state-of-the-art LLMs showcase an impressive ability to craft narratives based on contextual cues, highlighting their skill in comprehending and producing text resembling human writing. However, there exists a potential risk: the potential inclination of LLMs to create gossips when prompted with specific contexts. These LLMs possess the capacity to generate stories rooted in the context provided by the prompts. Yet, this very capability carries a risk of generating gossips given the context as input. To mitigate this, we introduce a dataset named “French GossipPrompts” designed for identifying prompts that lead to the creation of gossipy content in the French language. This dataset employs binary classification, categorizing whether a given prompt generates gossip or not. The dataset comprises a total of 7253 individual prompts. We have developed classification models and achieved an accuracy of 89.95%.

🧭 Keyword Pioneer — gossip detection

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Security & Privacy, Speech & Audio

Authors

Msvpj Sathvik , Abhilash Dowpati , Revanth Narra

Topics

Natural Language Processing > Applications > Text Classification Natural Language Processing > Resources & Methods > Multilingual NLP

Keywords

binary classification text classification french language prompt classification gossip detection

Download PDF

Related papers

A Dataset for Metaphor Detection in Early Medieval Hebrew Poetry 2024

PRILoRA: Pruned and Rank-Increasing Low-Rank Adaptation 2024

Overview of the Hate Speech Detection in Turkish and Arabic Tweets (HSD-2Lang) Shared Task at CASE 2024 2024

Evaluating In-Context Learning for Computational Literary Studies: A Case Study Based on the Automatic Recognition of Knowledge Transfer in German Drama 2024

Selam@DravidianLangTech 2024:Identifying Hate Speech and Offensive Language 2024