2026 EACL EACL 2026

Armenian AutoEpiDoc: Automated Extraction and Encoding of Armenian Inscriptions into EpiDoc TEI/XML

Abstract

AbstractArmenian epigraphy is extensively documented in printed scholarly corpora, yet lacks machine-readable editions that support interoperability or computational analysis. In this paper, we present Armenian AutoEpiDoc, a system that automatically converts expert-verified Armenian inscription records into EpiDoc-compliant TEI/XML files. Operating on curated and domain-validated data, AutoEpiDoc maps Armenian-specific metadata to EpiDoc structures through rule-based templates and schema-aware validation. The workflow significantly reduces manual encoding effort and provides a scalable path toward producing digital editions and integrating Armenian inscriptions into international epigraphic infrastructures.

🧭 Keyword Pioneer — epigraphic datum
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Data Science & Analytics, Deep Learning, Interdisciplinary, Machine Learning, Natural Language Processing, Reinforcement Learning, Speech & Audio