TurQUaz at GenAI Detection Task 1:Dr. Perplexity or: How I Learned to Stop Worrying and Love the Finetuning

Kaan Efe Keleş; Mucahid Kutlu

2025 COLING COLING 2025

TurQUaz at GenAI Detection Task 1:Dr. Perplexity or: How I Learned to Stop Worrying and Love the Finetuning

Abstract

AbstractThis paper details our methods for addressing Task 1 of the GenAI Content Detection shared tasks, which focus on distinguishing AI-generated text from human-written content. The task comprises two subtasks: Subtask A, centered on English-only datasets, and Subtask B, which extends the challenge to multilingual data. Our approach uses a fine-tuned XLM-RoBERTa model for classification, complemented by features including perplexity and TF-IDF. While perplexity is commonly regarded as a useful indicator for identifying machine-generated text, our findings suggest its limitations in multi-model and multilingual contexts. Our approach ranked 6th in Subtask A, but a submission issue left our Subtask B unranked, where it would have placed 23rd.

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning and Natural Language Processing

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Security & Privacy, Speech & Audio

Authors

Kaan Efe Keleş , Mucahid Kutlu

Topics

Deep Learning > Architectures > Transformers Natural Language Processing > Applications > Text Classification Machine Learning > Learning Types > Supervised Learning Deep Learning > Models > Transformers

Keywords

text classification machine-generated text detection ai-generated text detection transformer fine-tuning multilingual classification perplexity scoring transformer model tf-idf feature

Download PDF

Related papers

Navigating Dialectal Bias and Ethical Complexities in Levantine Arabic Hate Speech Detection 2025

TaCIE: Enhancing Instruction Comprehension in Large Language Models through Task-Centred Instruction Evolution 2025

Positive Text Reframing under Multi-strategy Optimization 2025

RAM2C: A Liberal Arts Educational Chatbot based on Retrieval-augmented Multi-role Multi-expert Collaboration 2025

Two-stage Incomplete Utterance Rewriting on Editing Operation 2025