USTHB at ArAIEval’23 Shared Task: Disinformation Detection System based on Linguistic Feature Concatenation

Mohamed Lichouri; Khaled Lounnas; Aicha Zitouni; Houda Latrache; Rachida Djeradi

EMNLP 2023

Abstract

In this paper, we examine several factors that affect the performance of Arabic disinformation detection in the ArAIEval’2023 shared task: surface preprocessing, morphological preprocessing, the FastText vector model, and the weighted fusion of TF-IDF features. For classification, we use the Linear Support Vector Classification (LSVC) model. In the evaluation phase, our system achieves F1 micro scores of 76.70% and 50.46% in the binary and multi-class classification scenarios, respectively. For comparison, the average F1 micro scores of the systems submitted to the second subtask are 77.96% and 64.85% for the binary and multi-class scenarios, respectively.
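As a rough illustration of what a weighted fusion of TF-IDF features feeding a linear SVM can look like, the sketch below uses scikit-learn. It is not the authors' implementation: the choice of word and character n-gram views, the n-gram ranges, the weights, the label names, and the toy sentences are all assumptions made for the example, and the preprocessing and FastText components mentioned in the abstract are left out.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.svm import LinearSVC


def build_model(word_weight=1.0, char_weight=0.5):
    """Concatenate two weighted TF-IDF views and classify with a linear SVM.

    The views, n-gram ranges, and weights are illustrative assumptions,
    not values reported in the paper.
    """
    features = FeatureUnion(
        transformer_list=[
            ("word", TfidfVectorizer(analyzer="word", ngram_range=(1, 2))),
            ("char", TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))),
        ],
        # Each view's matrix is multiplied by its weight before concatenation.
        transformer_weights={"word": word_weight, "char": char_weight},
    )
    return Pipeline([("features", features), ("clf", LinearSVC())])


# Hypothetical toy data; the labels are placeholders, not the task's label set.
train_texts = [
    "vaccine contains a secret tracking chip",
    "ministry publishes the new vaccination schedule",
    "drinking bleach cures the virus overnight",
    "hospital opens a new emergency wing",
]
train_labels = ["disinfo", "no-disinfo", "disinfo", "no-disinfo"]
test_texts = ["secret chip hidden in the vaccine", "ministry opens a new hospital"]
test_labels = ["disinfo", "no-disinfo"]

model = build_model()
model.fit(train_texts, train_labels)
predictions = model.predict(test_texts)

# Micro-averaged F1 pools true/false positives and negatives over all classes.
micro_f1 = f1_score(test_labels, predictions, average="micro")
print(f"F1 micro: {micro_f1:.4f}")
```

The same pipeline handles the binary and the multi-class scenario, since `LinearSVC` trains one-vs-rest classifiers when more than two labels are present. The fusion weights would normally be tuned on a development set.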

Topics

Machine Learning > Core Methods > Classification
Natural Language Processing > Applications > Text Classification
Machine Learning > Core Methods > Feature Selection
Machine Learning > Core Methods > Support Vector Machine

Keywords

text classification, support vector machine, Arabic language, FastText embedding, disinformation detection, term frequency-inverse document frequency
