COLING 2025

CNLP-NITS-PP at GenAI Detection Task 3: Cross-Domain Machine-Generated Text Detection Using DistilBERT Techniques

Abstract

This paper presents a cross-domain machine-generated text detection model developed for the COLING 2025 Workshop on Detecting AI-generated Content (DAIGenC). As large language models evolve, detecting machine-generated text becomes increasingly challenging, particularly in contexts such as misinformation and academic integrity. While current detectors perform well on unseen data, they remain vulnerable to adversarial strategies, including paraphrasing, homoglyphs, misspellings, synonym substitution, and whitespace manipulation. We introduce a framework that uses adversarial training to counter these tactics, which are designed to bypass detection systems. Our team's DistilBERT-NITS detector placed 7th in the Non-Adversarial Attacks category, and Adversarial-submission-3 placed 17th in the Adversarial Attacks category.
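To make two of the adversarial strategies named above concrete, the following is a minimal sketch of how homoglyph and whitespace perturbations could be generated and mixed into training data for adversarial training. It is an illustration only, not the authors' pipeline: the `HOMOGLYPHS` map, the function names, the `rate` parameter, and the one-perturbed-copy-per-sample policy are all assumptions introduced here.

```python
import random

# Hypothetical Latin -> Cyrillic look-alike map (illustrative, not from the paper).
HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e", "p": "\u0440", "c": "\u0441"}


def homoglyph_attack(text, rate, rng):
    """Replace each mappable character with its look-alike with probability `rate`."""
    return "".join(
        HOMOGLYPHS[ch] if ch in HOMOGLYPHS and rng.random() < rate else ch
        for ch in text
    )


def whitespace_attack(text, rate, rng):
    """Insert one extra space after each existing space with probability `rate`."""
    return "".join(
        ch + " " if ch == " " and rng.random() < rate else ch
        for ch in text
    )


def augment(samples, rate=0.3, seed=0):
    """Return the original (text, label) pairs plus one perturbed copy of each.

    Labels are preserved: a perturbed machine-generated text is still
    machine-generated, which is what the detector should learn.
    """
    rng = random.Random(seed)
    attacks = [homoglyph_attack, whitespace_attack]
    augmented = list(samples)
    for text, label in samples:
        attack = rng.choice(attacks)
        augmented.append((attack(text, rate, rng), label))
    return augmented
```

The augmented list could then be tokenized and used to fine-tune a classifier such as DistilBERT in the usual way. The sketch covers only character-level perturbations; paraphrasing and synonym substitution would need additional tooling.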
