Beyond Spatial Frequency: Pixel-wise Temporal Frequency-based Deepfake Video Detection

Taehoon Kim; Jongwook Choi; Yonghyun Jeong; Haeun Noh; Jaejun Yoo; Seungryul Baek; Jongwon Choi

ICCV 2025

Abstract

We introduce a deepfake video detection approach that exploits pixel-wise temporal inconsistencies, which traditional spatial frequency-based detectors often overlook. These detectors represent temporal information merely by stacking spatial frequency spectra across frames, and therefore fail to detect pixel-wise temporal artifacts. Our approach applies a 1D Fourier transform along the time axis at each pixel, extracting features that are highly sensitive to temporal inconsistencies, especially in regions prone to unnatural movements. To precisely locate regions containing these temporal artifacts, we introduce an attention proposal module trained end-to-end. In addition, our joint transformer module integrates the pixel-wise temporal frequency features with spatio-temporal context features, expanding the range of detectable forgery artifacts. The resulting framework performs robustly across diverse and challenging detection scenarios.
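
The per-pixel temporal transform described in the abstract can be illustrated with a minimal sketch. It assumes a grayscale clip stored as a NumPy array of shape (T, H, W); the function name `pixelwise_temporal_spectrum` and the synthetic clip are hypothetical and not taken from the paper. The attention proposal module and joint transformer are not covered here.

```python
import numpy as np


def pixelwise_temporal_spectrum(video: np.ndarray) -> np.ndarray:
    """Magnitude of the 1D FFT along the time axis, computed per pixel.

    Assumption: `video` is a real-valued grayscale clip of shape (T, H, W).
    Returns an array of shape (T // 2 + 1, H, W); index 0 is the DC term.
    """
    if video.ndim != 3:
        raise ValueError("expected a (T, H, W) array")
    # rfft along axis 0 treats every pixel location as its own 1D time series.
    spectrum = np.fft.rfft(video.astype(np.float64), axis=0)
    return np.abs(spectrum)


# Synthetic clip: a static background with one pixel flickering at a
# known rate of 4 cycles per clip (purely illustrative data).
T, H, W = 32, 4, 5
t = np.arange(T)
clip = np.full((T, H, W), 0.5)
clip[:, 2, 3] += 0.2 * np.sin(2 * np.pi * 4 * t / T)

spec = pixelwise_temporal_spectrum(clip)
# Skip the DC bin when looking for the dominant temporal frequency.
dominant_bin = int(np.argmax(spec[1:, 2, 3])) + 1
print("dominant temporal frequency bin at pixel (2, 3):", dominant_bin)
```

In this sketch, static pixels carry energy only in the DC bin, while the flickering pixel shows a peak at its flicker frequency. This is the kind of pixel-level temporal cue that a stack of per-frame spatial spectra does not expose directly.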

Topics

Computer Vision > Analysis > Anomaly Detection
Computer Vision > Analysis > Video Understanding

Keywords

attention mechanism, deepfake detection, video forensics, temporal frequency, pixel-wise analysis
