Repurposing Pre-trained Video Diffusion Models for Event-based Video Interpolation

Jingxi Chen; Brandon Y. Feng; Haoming Cai; Tianfu Wang; Levi Burner; Dehao Yuan; Cornelia Fermuller; Christopher A. Metzler; Yiannis Aloimonos

CVPR 2025

Abstract

Video Frame Interpolation aims to recover realistic missing frames between observed frames, generating a high-frame-rate video from a low-frame-rate video. However, without additional guidance, large motion between frames makes this problem ill-posed. Event-based Video Frame Interpolation (EVFI) addresses this challenge by using sparse, high-temporal-resolution event measurements as motion guidance. This guidance allows EVFI methods to significantly outperform frame-only methods. However, to date, EVFI methods have relied upon a limited set of paired event-frame training data, severely limiting their performance and generalization capabilities. In this work, we overcome the limited data challenge by adapting pre-trained video diffusion models trained on internet-scale datasets to EVFI. We experimentally validate our approach on real-world EVFI datasets, including a new one we introduce. Our method outperforms existing methods and generalizes across cameras far better than existing approaches.
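
To make the idea of sparse, high-temporal-resolution event measurements concrete, the sketch below bins a stream of events into a small stack of temporal slices, often called a voxel grid. This is a common generic representation for event data and is not taken from the paper; the function name `events_to_voxel_grid`, the `(timestamp, x, y, polarity)` tuple layout, and the nearest-bin accumulation are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical event layout: (timestamp, x, y, polarity), polarity in {-1, +1}.
Event = Tuple[float, int, int, int]


def events_to_voxel_grid(
    events: List[Event],
    num_bins: int,
    height: int,
    width: int,
    t_start: float,
    t_end: float,
) -> List[List[List[int]]]:
    """Sum signed event polarities into num_bins temporal slices.

    Returns a nested list indexed as grid[bin][y][x].
    """
    if t_end <= t_start:
        raise ValueError("t_end must be greater than t_start")
    if num_bins < 1:
        raise ValueError("num_bins must be at least 1")

    grid = [
        [[0 for _ in range(width)] for _ in range(height)]
        for _ in range(num_bins)
    ]
    span = t_end - t_start

    for t, x, y, polarity in events:
        # Skip events outside the time window or the sensor area.
        if not (t_start <= t <= t_end):
            continue
        if not (0 <= x < width and 0 <= y < height):
            continue
        # Nearest-bin assignment; an event at exactly t_end goes in the last bin.
        bin_index = min(int((t - t_start) / span * num_bins), num_bins - 1)
        grid[bin_index][y][x] += polarity

    return grid


# Four events between two frames captured at t = 0.0 and t = 1.0.
example_events = [(0.00, 0, 0, 1), (0.10, 1, 0, 1), (0.60, 1, 0, -1), (0.90, 1, 1, 1)]
example_grid = events_to_voxel_grid(example_events, 2, 2, 2, 0.0, 1.0)
```

Each slice summarizes the brightness changes that occurred in one sub-interval between the two observed frames, which is the kind of motion cue a frame-only interpolator lacks. How the authors actually encode events for the diffusion model is not specified in this abstract.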

Topics

Machine Learning > Application Areas > Domain Adaptation
Deep Learning > Models > Diffusion Models
Computer Vision > Processing > Video Processing

Keywords

video frame interpolation, event camera, diffusion model, temporal resolution, motion guidance
