EACL 2026

Narrative in Short German Prose: A Multi-Phenomenon Dataset for Computational Literary Analysis

Abstract

We present GermAnProse, a novel annotated corpus of four German short prose texts accompanied by an extensive set of narrative-focused annotations. As part of this dataset, we contribute Characters in Action (ChiA), an annotation scheme for mentions, speech, and character agency. GermAnProse also contains annotations of narrative phenomena: narrativity, semantic verb classes, and plot keyness. Moreover, we include reader reception data in the form of timing information for audiobook performances, indicating the pauses between sentences and the time taken to read each sentence in a performance. We release the dataset, which contains more than 18,000 manually created standoff annotations in JSON format, enabling researchers to use this resource for further exploratory applications.
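
Standoff annotations store character offsets into the unmodified text rather than inline markup. The following is a minimal sketch of how such records could be resolved against a text. The field names (`layer`, `begin`, `end`, `label`), the sample sentence, and the helper `resolve` are assumptions for illustration only and do not reflect the released GermAnProse schema.

```python
import json

# Illustrative sample text, not taken from the corpus.
text = "Der Mann ging nach Hause. Er schwieg."

# Hypothetical standoff records: the field names are assumptions,
# not the schema of the released dataset.
annotations_json = """
[
  {"layer": "mention", "begin": 0, "end": 8, "label": "character"},
  {"layer": "mention", "begin": 26, "end": 28, "label": "character"},
  {"layer": "narrativity", "begin": 0, "end": 25, "label": "event"}
]
"""


def resolve(text, annotations, layer):
    """Return (covered_text, label) pairs for one annotation layer."""
    return [
        (text[a["begin"]:a["end"]], a["label"])
        for a in annotations
        if a["layer"] == layer
    ]


annotations = json.loads(annotations_json)
mentions = resolve(text, annotations, "mention")
print(mentions)
```

Because the text itself is never modified, several annotation layers can overlap freely on the same span, which is presumably why a standoff format suits a multi-phenomenon corpus.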
