DivScore: Zero-Shot Detection of LLM-Generated Text in Specialized Domains

Zhihui Chen; Kai He; Yucheng Huang; Yunxiao Zhu; Mengling Feng

EMNLP 2025

Abstract

Detecting LLM-generated text in specialized, high-stakes domains such as medicine and law is crucial for combating misinformation and ensuring authenticity. However, current zero-shot detectors, while effective on general text, often fail on specialized content because of domain shift. We provide a theoretical analysis showing that this failure is fundamentally linked to the KL divergence between the human, detector, and source text distributions. To address this, we propose DivScore, a zero-shot detection framework that uses normalized entropy-based scoring and domain knowledge distillation to robustly identify LLM-generated text in specialized domains. Experiments on medical and legal datasets show that DivScore consistently outperforms state-of-the-art detectors, with 14.4% higher AUROC and 64.0% higher recall at a 0.1% false positive rate threshold. In adversarial settings, DivScore is more robust than the other baselines, with an average advantage of 22.8% in AUROC and 29.5% in recall.
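The two metrics quoted in the abstract, AUROC and recall at a fixed false positive rate, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `auroc` and `recall_at_fpr`, the convention that a higher score means "more likely LLM-generated", and the toy data are all assumptions made for illustration.

```python
import numpy as np


def auroc(labels, scores):
    """AUROC via pairwise comparison (Mann-Whitney U); ties count as half."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]    # scores of LLM-generated texts (assumed label 1)
    neg = scores[~labels]   # scores of human-written texts (assumed label 0)
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)


def recall_at_fpr(labels, scores, max_fpr=0.001):
    """Best recall over thresholds whose false positive rate is <= max_fpr.

    A text is flagged when score >= threshold.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]
    neg = scores[~labels]
    best = 0.0
    for threshold in np.unique(scores):
        fpr = (neg >= threshold).mean()
        if fpr <= max_fpr:
            best = max(best, (pos >= threshold).mean())
    return best


# Toy data (hypothetical): four human texts, four LLM-generated texts.
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_scores = [0.1, 0.2, 0.3, 0.8, 0.4, 0.7, 0.9, 0.95]
print(auroc(toy_labels, toy_scores))
print(recall_at_fpr(toy_labels, toy_scores, max_fpr=0.001))
```

On a sample this small, a 0.1% false positive rate budget allows no false positives at all, so the threshold must sit above every human-written score. That is why recall at low FPR is a much stricter measure than AUROC.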

Topics

Artificial Intelligence > Core AI > Interpretability
Machine Learning > Application Areas > Domain Adaptation
Artificial Intelligence > Learning Paradigms > Zero-Shot Learning

Keywords

domain adaptation; KL divergence; zero-shot detection; specialized domain; entropy-based scoring
