EACL 2026

MetaSwarm at AbjadMed: Forensic Optimization and Class-Balanced Discovery for Medical Diglossia in Abjad Scripts

Abstract

The classification of diglossic medical text presents a high-dimensional challenge defined by extreme class imbalance (N = 82) and the orthographic ambiguity of unvocalized Abjad scripts. While standard supervised learning often collapses into majority-class prediction due to the "Long Tail" distribution, we introduce a Human-in-the-Loop Forensic Optimization framework. Unlike static end-to-end pipelines, our approach decouples strategic hyperparameter tuning from high-throughput tactical execution (Elastic Compute). We leverage a rigorous Class-Balanced Focal Loss (CBFL) derived from the "Effective Number of Samples" theory (E_n) to stabilize the decision manifold against stochastic class dominance. Using a CAMELBERT-DA backbone optimized via a custom weighted trainer on Dual H200 GPUs, our system achieved a robust Public Leaderboard score of 0.3588. We further perform a "Linguistic Error Topology" analysis, utilizing UMAP projections and attention saliency, to demonstrate that generalization gaps are driven by dialectal "Constraint Drift" rather than stochastic model failure.
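
As an illustration of the Class-Balanced Focal Loss named in the abstract, the following is a minimal sketch in plain Python, assuming the common formulation in which the effective number of samples for a class with n examples is E_n = (1 - β^n) / (1 - β) and each class is weighted by 1 / E_n. The values of β and γ, the class counts, the logits, and all function names are hypothetical choices for illustration; the abstract does not state the settings or implementation actually used.

```python
import math

def class_balanced_weights(counts, beta=0.999):
    """Per-class weights proportional to 1 / E_n, with E_n = (1 - beta**n) / (1 - beta).

    Assumes every count is at least 1 and 0 <= beta < 1.
    """
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in counts]
    scale = len(counts) / sum(raw)  # normalize so the weights sum to the number of classes
    return [w * scale for w in raw]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cb_focal_loss(logits, target, weights, gamma=2.0):
    """Loss for one example: -w_y * (1 - p_y)**gamma * log(p_y)."""
    p = softmax(logits)[target]
    return -weights[target] * (1.0 - p) ** gamma * math.log(p)

# Hypothetical long-tailed class frequencies (not taken from the paper).
counts = [5000, 200, 10]
weights = class_balanced_weights(counts)

# The same logits are penalized more heavily when the true class is rare.
loss_head = cb_focal_loss([2.0, 0.5, 0.1], target=0, weights=weights)
loss_tail = cb_focal_loss([2.0, 0.5, 0.1], target=2, weights=weights)
```

With γ = 0 and uniform weights the function reduces to ordinary cross-entropy, which makes it easy to sanity-check against a standard loss before applying the reweighting.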


Authors