2025 NAACL NAACL 2025

Multi-Strategy Named Entity Recognition System for Ancient Chinese

Abstract

AbstractWe present a multi-strategy Named Entity Recognition (NER) system for ancient Chi-nese texts in EvaHan2025. Addressing dataset heterogeneity, we use a Conditional Random Field (CRF) for Tasks A and C to handle six entity types’ complex dependencies, and a lightweight Softmax classifier for Task B’s simpler three-entity tagset. Ablation studies on training data confirm CRF’s superiority in capturing sequence dependencies and Softmax’s computational advantage for simpler tasks. On blind tests, our system achieves F1-scores of 83.94%, 88.31%, and 82.15% for Test A, B, and C—outperforming baselines by 2.46%, 0.81%, and 9.75%. With an overall F1 improvement of 4.30%, it excels across historical and medical domains. This adaptability enhances knowledge extraction from ancient texts, offering a scalable NER framework for low-resource, complex languages.

🌉 Interdisciplinary Bridge — Deep Learning and Interdisciplinary and Machine Learning and Natural Language Processing
🧭 Keyword Pioneer — multi-strategy approach
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio