Exploring Causal Mechanisms for Machine Text Detection Methods

KiYoon Yoo; Wonhyuk Ahn; Yeji Song; Nojun Kwak

NAACL 2024

Abstract

The immense attention that ChatGPT has drawn to text generation has spurred the need to distinguish machine text from human text. In this work, we provide preliminary evidence that the scores computed by existing zero-shot and supervised machine-generated text detection methods are not solely determined by the generated texts, but are affected by prompts and real texts as well. Using techniques from causal inference, we show the existence of backdoor paths that confound the relationship between a text and its detection score, and how the confounding bias can be partially mitigated. We open up new research directions in identifying other factors that may be interwoven in the detection of machine text. Our study calls for a deeper investigation into which kinds of prompts make the detection of machine text more difficult or easier.
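To make the notion of a backdoor path concrete, the following is a minimal sketch of backdoor adjustment on a toy dataset. It is a generic textbook illustration, not the authors' procedure: the variable names (`z` for a hypothetical binary prompt type, `x` for whether a text is machine-generated, `y` for a detection score) and all numbers are assumptions made up for this example. The scores are constructed as `y = 1.0 * x + 2.0 * z`, so the true effect of `x` on `y` is 1.0, while `z` also influences how often `x = 1` occurs.

```python
from collections import defaultdict


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def naive_effect(rows):
    """Difference in mean score between x=1 and x=0, ignoring the confounder."""
    y1 = [y for _, x, y in rows if x == 1]
    y0 = [y for _, x, y in rows if x == 0]
    return mean(y1) - mean(y0)


def backdoor_adjusted_effect(rows):
    """Backdoor adjustment: average the within-stratum effect over P(z).

    Computes sum_z P(z) * (E[y | x=1, z] - E[y | x=0, z]).
    Assumes every stratum of z contains both x=0 and x=1 rows.
    """
    by_stratum = defaultdict(lambda: {0: [], 1: []})
    for z, x, y in rows:
        by_stratum[z][x].append(y)

    n = len(rows)
    effect = 0.0
    for groups in by_stratum.values():
        p_z = (len(groups[0]) + len(groups[1])) / n
        effect += p_z * (mean(groups[1]) - mean(groups[0]))
    return effect


# Hypothetical toy data as (z, x, y) triples with y = 1.0 * x + 2.0 * z.
# Under z=1 most texts have x=1, so z opens a backdoor path x <- z -> y.
ROWS = (
    [(0, 0, 0.0)] * 3 + [(0, 1, 1.0)] * 1
    + [(1, 0, 2.0)] * 1 + [(1, 1, 3.0)] * 3
)

if __name__ == "__main__":
    print("naive effect:   ", naive_effect(ROWS))
    print("adjusted effect:", backdoor_adjusted_effect(ROWS))
```

In this toy setting the naive comparison overstates the effect because it mixes in the influence of `z`, while stratifying on `z` recovers the value used to construct the data. With real detection scores the confounders are not fully observed, which is in line with the abstract's statement that the bias can only be partially mitigated.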

Topics

Machine Learning > Learning Types > Zero-Shot Learning
Knowledge & Reasoning > Reasoning > Causal Inference

Keywords

causal inference, zero-shot learning, machine text detection, prompt confounding, detection bias
