CIC-NLP at GenAI Detection Task 1: Advancing Multilingual Machine-Generated Text Detection

Tolulope Olalekan Abiola; Tewodros Achamaleh Bizuneh; Fatima Uroosa; Nida Hafeez; Grigori Sidorov; Olga Kolesnikova; Olumide Ebenezer Ojo

COLING 2025

Abstract

Machine-written texts are gradually becoming indistinguishable from human-written texts, creating a need for sophisticated methods to detect them. Team CIC-NLP presents its work for the GenAI Content Detection Task 1 at the COLING 2025 Workshop. Our work focuses on Subtask B of Task 1, the classification of text as machine-written or human-written, with particular attention to the multilingual binary classification problem. Using mBERT, we addressed the binary classification task with the dataset provided by the GenAI Detection Task team. mBERT achieved a macro-average F1-score of 0.72 and an accuracy of 0.73.
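The two metrics reported in the abstract can be illustrated with a minimal sketch. The function name `macro_f1_and_accuracy`, the label encoding (0 for human, 1 for machine), and the toy labels in the usage note are assumptions made for illustration; this is not the authors' evaluation code, and the task's official scorer may differ in details.

```python
from typing import Sequence, Tuple


def macro_f1_and_accuracy(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (macro-average F1, accuracy) for two equal-length label lists.

    Macro-averaging computes F1 separately for each class and then takes
    the unweighted mean, so both classes count equally regardless of size.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))

    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)

    macro_f1 = sum(f1_scores) / len(f1_scores)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return macro_f1, accuracy
```

As a usage example with made-up labels, `macro_f1_and_accuracy([0, 0, 0, 1, 1, 1, 1, 1], [0, 0, 1, 1, 1, 1, 0, 0])` gives a per-class F1 of 4/7 for class 0 and 2/3 for class 1, so the macro-average is about 0.619 while the accuracy is 0.625. The two numbers can diverge like this whenever the classes are imbalanced or the errors are asymmetric.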

Topics

Natural Language Processing > Applications > Text Classification
Natural Language Processing > Resources & Methods > Multilingual NLP
Machine Learning > Learning Types > Supervised Learning
Deep Learning > Models > Transformers

Keywords

binary classification, text classification, machine-generated text detection, performance metric, cross-lingual model, multilingual classification, multilingual binary classification, mBERT model, macro-average F1
