2026 EACL EACL 2026

Stylometric Approach to AI-generated Texts. An Analysis of Contemporary French-Language Literature

Abstract

AbstractThe article focuses on a stylometric analysis of authentic literary texts and thematically related texts generated by large language models. The texts under study represent a fairly broad cross-section of twentieth-century French literature. Five models were used to generate the texts (ChatGPT 4-o, GPT 4-o mini, DeepSeek v.3, c4ai-command-r-plus, and c4ai-command-a). The original human-written stories of approximately 20,000 characters were summarized, and new narratives were then generated on the basis of these abstracts. In terms of plot and style, they were intended to resemble the originals. The research carried out with TF-IDF of the most frequent words showed that texts generated by specific LLMs and written by humans cluster relatively well as distinct groups. The experiments also showed that the "authorial" specificity of machine-generated texts partly matches the original clustering of human-written source texts.

🌉 Interdisciplinary Bridge — Machine Learning and Natural Language Processing
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio