EACL 2024

Detecting Structured Language Alternations in Historical Documents by Combining Language Identification with Fourier Analysis

Abstract

In this study, we present a generalizable workflow to identify documents in a historical language with a nonstandard language and script combination, Armeno-Turkish. We introduce the task of detecting distinct patterns of multilinguality based on the frequency of structured language alternations within a document.
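To make the idea of measuring the frequency of language alternations concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a language identification step has already assigned one label to each unit of a document (for example each line), turns those labels into a binary signal, and uses a discrete Fourier transform to find the strongest alternation period. The function name `dominant_alternation_period` and the labels `"A"` and `"B"` are hypothetical and chosen only for illustration.

```python
import numpy as np


def dominant_alternation_period(labels, target):
    """Return (period, strength) of the strongest periodic alternation.

    labels: per-unit language labels, assumed to come from a separate
            language identification step (hypothetical input).
    target: the label whose presence is encoded as 1 in the signal.
    period is measured in units (e.g. lines); strength is the share of
    non-constant spectral power carried by the dominant frequency.
    """
    if len(labels) < 2:
        return None, 0.0

    # Binary indicator: 1 where the unit is in the target language.
    signal = np.array([1.0 if lab == target else 0.0 for lab in labels])
    # Subtract the mean so the zero-frequency term does not dominate.
    signal = signal - signal.mean()

    # Power spectrum of the real-valued signal.
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0)

    total = spectrum[1:].sum()
    if total == 0:
        # A monolingual document has no alternation at all.
        return None, 0.0

    # Skip index 0 (the constant component) when searching for the peak.
    k = 1 + int(np.argmax(spectrum[1:]))
    return 1.0 / freqs[k], float(spectrum[k] / total)


# Toy document: two units of language "A", then two of "B", repeated.
period, strength = dominant_alternation_period(["A", "A", "B", "B"] * 8, "A")
```

A strongly peaked spectrum (strength close to 1) would suggest a regular, structured alternation, while a flat spectrum would suggest irregular mixing. Any threshold separating the two is a further assumption that would need to be tuned on real data.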
