2024 SEMEVAL SemEval 2024

Weighted Layer Averaging RoBERTa for Black-Box Machine-Generated Text Detection

Abstract

AbstractWe propose a novel approach for machine-generated text detection using a RoBERTa model with weighted layer averaging and AdaLoRA for parameter-efficient fine-tuning. Our method incorporates information from all model layers, capturing diverse linguistic cues beyond those accessible from the final layer alone. To mitigate potential overfitting and improve generalizability, we leverage AdaLoRA, which injects trainable low-rank matrices into each Transformer layer, significantly reducing the number of trainable parameters. Furthermore, we employ data mixing to ensure our model encounters text from various domains and generators during training, enhancing its ability to generalize to unseen data. This work highlights the potential of combining layer-wise information with parameter-efficient fine-tuning and data mixing for effective machine-generated text detection.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Machine Learning and Natural Language Processing
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Security & Privacy, Speech & Audio