Bridging the Gap Between Ideal and Real-world Evaluation: Benchmarking AI-Generated Image Detection in Challenging Scenarios

Chunxiao Li; Xiaoxiao Wang; Meiling Li; Boming Miao; peng sun; Yunjian Zhang; Xiangyang Ji; Yao Zhu

2025 ICCV ICCV 2025

Bridging the Gap Between Ideal and Real-world Evaluation: Benchmarking AI-Generated Image Detection in Challenging Scenarios

Abstract

With the rapid advancement of generative models, highly realistic image synthesis has posed new challenges to digital security and media credibility. Although AI-generated image detection methods have partially addressed these concerns, a substantial research gap remains in evaluating their performance under complex real-world conditions. This paper introduces the Real-World Robustness Dataset (RRDataset) for comprehensive evaluation of detection models across three dimensions: 1) Scenario Generalization - RRDataset encompasses high-quality images from seven major scenarios (War & Conflict, Disasters & Accidents, Political & Social Events, Medical & Public Health, Culture & Religion, Labor & Production, and everyday life), addressing existing dataset gaps from a content perspective. 2) Internet Transmission Robustness - examining detector performance on images that have undergone multiple rounds of sharing across various social media platforms.3) Re-digitization Robustness - assessing model effectiveness on images altered through four distinct re-digitization methods.We benchmarked 17 detectors and 10 vision-language models (VLMs) on RRDataset and conducted a large-scale human study involving 192 participants to investigate human few-shot learning capabilities in detecting AI-generated images. The benchmarking results reveal the limitations of current AI detection methods under real-world conditions and underscore the importance of drawing on human adaptability to develop more robust detection algorithms. Our dataset is publicly available at: https://zenodo.org/records/14963880.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Computer Vision and Machine Learning

🧭 Keyword Pioneer — ai image detection

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Chunxiao Li , Xiaoxiao Wang , Meiling Li , Boming Miao , peng sun , Yunjian Zhang , Xiangyang Ji , Yao Zhu

Topics

Artificial Intelligence > Core AI > Interpretability Artificial Intelligence > Core AI > Responsible AI Machine Learning > Application Areas > Domain Generalization Computer Vision > Analysis > Anomaly Detection Machine Learning > Learning Types > Evaluation

Keywords

domain generalization generative model benchmark dataset ai-generated image detection robustness evaluation image forensics ai image detection real-world scenario

Download PDF

Related papers

MA-CIR: A Multimodal Arithmetic Benchmark for Composed Image Retrieval 2025

SimMLM: A Simple Framework for Multi-modal Learning with Missing Modality 2025

MonSTeR: a Unified Model for Motion, Scene, Text Retrieval 2025

ASGS: Single-Domain Generalizable Open-Set Object Detection via Adaptive Subgraph Searching 2025

Robust Dataset Condensation using Supervised Contrastive Learning 2025