AraDetector at ArAIEval Shared Task: An Ensemble of Arabic-specific pre-trained BERT and GPT-4 for Arabic Disinformation Detection

Ahmed Bahaaulddin; Vian Sabeeh; Hanan Belhaj; Serry Sibaee; Samar Ahmad; Ibrahim Khurfan; Abdullah Alharbi

EMNLP 2023


Abstract

The rapid proliferation of disinformation through social media has become one of the most dangerous means of deceiving people and influencing their thoughts, viewpoints, or behaviors, owing to the rapid access, low cost, and ease of use that social media offers. Disinformation spreads through social media in many forms, such as fake news stories, doctored images or videos, deceptive data, and conspiracy theories, which makes detecting it challenging. This paper describes our participation in the ArAIEval shared task on disinformation detection. We evaluated four models: MARBERT, the proposed ensemble model, and two GPT-4 configurations (zero-shot and few-shot). GPT-4 achieved a micro-F1 of 79.01%, while the ensemble method obtained 76.83%. Although the ensemble approach did not improve the micro-F1 score on the dev dataset, we still used it for the test dataset predictions, because we believed that merging different classifiers might enhance the system’s prediction accuracy.
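As a rough illustration of the two ideas the abstract relies on, the sketch below combines the predictions of several classifiers and scores the result with micro-F1. The abstract does not say how the ensemble merges its members, so the majority-vote rule here is an assumption, and the label names and toy predictions are hypothetical rather than taken from the ArAIEval data.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists by majority vote.

    Assumed combination rule (not stated in the paper). Ties are broken
    in favor of the earliest model in the list.
    """
    combined = []
    for votes in zip(*predictions_per_model):
        counts = Counter(votes)
        top = max(counts.values())
        # Pick the first vote, in model order, that reaches the top count.
        combined.append(next(v for v in votes if counts[v] == top))
    return combined


def micro_f1(gold, pred):
    """Micro-averaged F1: pool TP, FP, and FN over all classes first."""
    tp = fp = fn = 0
    for label in set(gold) | set(pred):
        for g, p in zip(gold, pred):
            if p == label and g == label:
                tp += 1
            elif p == label:
                fp += 1
            elif g == label:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical labels and predictions, for illustration only.
gold = ["disinfo", "no-disinfo", "disinfo", "no-disinfo"]
model_a = ["disinfo", "disinfo", "disinfo", "no-disinfo"]
model_b = ["disinfo", "no-disinfo", "no-disinfo", "no-disinfo"]
model_c = ["no-disinfo", "no-disinfo", "disinfo", "no-disinfo"]

ensemble = majority_vote([model_a, model_b, model_c])
print(micro_f1(gold, model_a), micro_f1(gold, ensemble))
```

For single-label classification, micro-F1 computed over all classes equals plain accuracy, which is why a single model with one error out of four scores 0.75 in this toy data.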



Topics

Machine Learning > Core Methods > Classification
Deep Learning > Architectures > Transformers
Deep Learning > Learning Types > Classification
Deep Learning > Learning Types > Ensemble Learning
Artificial Intelligence > Core AI > Natural Language Processing

Keywords

transformer architecture; few-shot learning; ensemble learning; text classification; ensemble method; pretrained language model; Arabic language; disinformation detection; Arabic NLP; large language model

