2026 EACL EACL 2026

Ranking Human and LLM Texts Using Locality Statistics

Abstract

AbstractThe paper extends the Data Movement Distance (DMD) – a metric defined to measure the locality in computer memory – to text by defining a normalized version called nDMD. A key feature of nDMD is a new term designed to better characterize low-frequency tokens. By evaluating nDMD on English subset of the M4 dataset and GenAI detection shared task, the paper shows three key findings. First, nDMD is systematically higher in human-written text than in machine-generated text. Second, nDMD-based features not only outperform frequency baselines but also improve overall performance when combined. Finally, the proposed DMD normalization is more effective in distinguishing human and machine text than alternative normalization approaches.

🌉 Interdisciplinary Bridge — Machine Learning and Natural Language Processing
🧭 Keyword Pioneer — data movement distance
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Security & Privacy, Speech & Audio