WACV 2026

Denoise, Divide, Distill, and Predict (D3P): Towards Forecasting Long-horizon Real-world Anomaly from Normalcy

Abstract

Forecasting abnormal human behavior (AHB) in unconstrained real-world environments is critical for enabling proactive safety interventions. Unlike short-term anomaly detection, long-horizon forecasting offers a vital reaction window but remains underexplored due to three core challenges: (i) noisy, complex human-agent interactions; (ii) weak temporal coupling between normal observations and distant anomalies; and (iii) data scarcity limiting the scalability of autoregressive models. To address these, we propose $\mathcal{D}^3\mathcal{P}$ (Denoise, Divide, Distill, and Predict), a novel encoder-decoder framework that bridges denoised pasts with distilled autoregressive futures. Our Differential Past Encoder (DiPE) disentangles scene-level and object-level dynamics via differential attention, suppressing irrelevant interactions and enhancing discriminative cues. The Distilled Future Auto-Regressive Decoder (D-FAD) adopts a divide-and-conquer strategy, segmenting future queries into temporal chunks for sequential prediction, while leveraging distillation to balance robustness and latency. We validate our approach on the AHB-F benchmark, the only dataset dedicated to abnormal behavior forecasting, and further integrate D-FAD with several state-of-the-art methods. In all cases, our framework consistently outperforms prior work in both forecasting accuracy and computational efficiency.
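The divide-and-conquer decoding described for D-FAD can be illustrated with a minimal sketch. The abstract does not specify the architecture, so everything below is an assumption made for illustration: the function names `split_into_chunks` and `chunked_autoregressive_decode`, the `predict_chunk` callback, and the toy predictor are hypothetical and are not the authors' implementation. The sketch only shows the control flow of splitting a future horizon into temporal chunks and predicting them sequentially, with each chunk conditioned on the past plus all previously predicted chunks. Distillation and differential attention are not modeled.

```python
from typing import Callable, List, Sequence


def split_into_chunks(num_steps: int, chunk_size: int) -> List[range]:
    """Divide future step indices 0..num_steps-1 into consecutive chunks."""
    if num_steps < 0 or chunk_size <= 0:
        raise ValueError("num_steps must be >= 0 and chunk_size must be > 0")
    return [
        range(start, min(start + chunk_size, num_steps))
        for start in range(0, num_steps, chunk_size)
    ]


def chunked_autoregressive_decode(
    past: Sequence[float],
    num_steps: int,
    chunk_size: int,
    predict_chunk: Callable[[Sequence[float], int], List[float]],
) -> List[float]:
    """Predict num_steps future values one chunk at a time.

    predict_chunk(context, n) is a hypothetical stand-in for a decoder call:
    it receives the running context (past plus earlier predictions) and
    returns n predictions for the next chunk.
    """
    context = list(past)
    predictions: List[float] = []
    for chunk in split_into_chunks(num_steps, chunk_size):
        out = predict_chunk(context, len(chunk))
        if len(out) != len(chunk):
            raise ValueError("predict_chunk returned the wrong number of steps")
        predictions.extend(out)
        # Feed this chunk back so the next chunk is conditioned on it.
        context.extend(out)
    return predictions


def toy_predictor(context: Sequence[float], n: int) -> List[float]:
    """Toy stand-in model: continue counting up from the last context value."""
    last = context[-1]
    return [last + i + 1 for i in range(n)]


if __name__ == "__main__":
    # Five future steps, decoded in chunks of two.
    print(chunked_autoregressive_decode([0.0], 5, 2, toy_predictor))
```

In this toy setup, a horizon of five steps with a chunk size of two is decoded in three sequential calls (two, two, and one step). Under these assumptions, a larger chunk size means fewer sequential calls, while a smaller one conditions each prediction on more recent outputs.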
