EACL 2026

Olga Snissarenko at AbjadMed: Arabic Clinical Text Classification with AraBERT: Results from the AbjadMed Shared Task

Abstract

We present a solution for the Arabic medical text classification task, formulated as a multi-class classification problem with 82 medical categories. The task is challenging due to severe class imbalance, long and heterogeneous input texts, and the presence of domain-specific medical terminology in Modern Standard Arabic. Our approach is based on fine-tuning pretrained AraBERT models with a focus on loss-level imbalance handling rather than architectural complexity. Through a systematic comparison of multiple AraBERT-based configurations, we show that class-weighted loss combined with simple mean pooling yields the strongest performance. Our best model achieves a macro-F1 score of 0.387 on the public evaluation set and 0.411 on the private test set.
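To make the two ingredients named in the abstract concrete, the following is a minimal, framework-free sketch of class-weighted cross-entropy and masked mean pooling. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the class weights were computed, so the inverse-frequency scheme here (the same heuristic as scikit-learn's "balanced" mode) is an assumption, and all function names and toy values are hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels, num_classes):
    """Weight each class by n_samples / (num_classes * class_count).

    Assumption: the paper does not state its weighting formula; this is
    one common choice. Classes with no training examples get weight 0.0.
    """
    counts = Counter(labels)
    n = len(labels)
    return [n / (num_classes * counts[c]) if counts[c] else 0.0
            for c in range(num_classes)]


def masked_mean_pool(token_vectors, attention_mask):
    """Average token vectors over positions where the mask is 1,
    so padding tokens do not dilute the document representation."""
    kept = [vec for vec, m in zip(token_vectors, attention_mask) if m]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def weighted_cross_entropy(logits, target, class_weights):
    """Class-weighted negative log-likelihood for a single example."""
    peak = max(logits)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return -class_weights[target] * (logits[target] - log_norm)


# Toy data (hypothetical): class 0 is three times as frequent as class 1.
labels = [0, 0, 0, 1]
weights = inverse_frequency_weights(labels, num_classes=2)

# Three token vectors; the last position is padding and is masked out.
pooled = masked_mean_pool([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]], [1, 1, 0])

# With identical logits, an error on the rare class costs more.
loss_rare = weighted_cross_entropy([0.0, 0.0], 1, weights)
loss_common = weighted_cross_entropy([0.0, 0.0], 0, weights)
```

In a PyTorch fine-tuning setup, the same weighting would typically be passed through the `weight` argument of `torch.nn.CrossEntropyLoss`. Because macro-F1 averages per-class F1 with equal weight for every class, raising the loss on rare classes targets exactly the categories that dominate this metric under severe imbalance.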
