VideoGigaGAN: Towards Detail-rich Video Super-Resolution

Yiran Xu; Taesung Park; Richard Zhang; Yang Zhou; Eli Shechtman; Feng Liu; Jia-Bin Huang; Difan Liu

CVPR 2025

Abstract

Video super-resolution (VSR) models achieve temporal consistency but often produce blurrier results than their image-based counterparts because of limited generative capacity. This raises the question: can a generative image upsampler be adapted for VSR while preserving temporal consistency? We introduce VideoGigaGAN, a new generative VSR model that combines high-frequency detail with temporal stability, building on the large-scale GigaGAN image upsampler. Naive adaptations of GigaGAN to VSR cause flickering, so we propose techniques that improve temporal consistency. We validate VideoGigaGAN by comparing it with state-of-the-art VSR models on public datasets and by showcasing video results with 8× upsampling.
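The flickering problem mentioned in the abstract can be illustrated with a toy sketch. This is not the VideoGigaGAN method or the GigaGAN architecture. It only assumes that a per-frame generative upsampler adds detail independently to each frame, which is modelled here with seeded random noise. The helper names (`upsample_nearest`, `add_detail`, `flicker`) and the mean-absolute-difference flicker proxy are hypothetical choices made for illustration.

```python
import random

def upsample_nearest(frame, scale):
    """Nearest-neighbour upsampling of a 2D list of floats by an integer scale."""
    out = []
    for row in frame:
        wide = [v for v in row for _ in range(scale)]
        out.extend([list(wide) for _ in range(scale)])
    return out

def add_detail(frame, seed, strength=0.1):
    """Toy stand-in for generated detail (an assumption): seeded random texture."""
    rng = random.Random(seed)
    return [[v + rng.uniform(-strength, strength) for v in row] for row in frame]

def flicker(frames):
    """Mean absolute difference between consecutive frames (a simple proxy)."""
    total, count = 0.0, 0
    for prev, cur in zip(frames, frames[1:]):
        for row_prev, row_cur in zip(prev, cur):
            for a, b in zip(row_prev, row_cur):
                total += abs(a - b)
                count += 1
    return total / count

# A static low-resolution clip: all frames are identical, so an ideal
# upsampled video would show no frame-to-frame change at all.
low_res = [[[0.2, 0.8], [0.5, 0.1]] for _ in range(4)]
scale = 8  # matches the 8x factor mentioned in the abstract
upsampled = [upsample_nearest(f, scale) for f in low_res]

# Detail generated independently per frame (a different seed for each frame).
independent = [add_detail(f, seed=t) for t, f in enumerate(upsampled)]
# Detail that is consistent across frames (the same seed for every frame).
shared = [add_detail(f, seed=0) for f in upsampled]

flicker_independent = flicker(independent)
flicker_shared = flicker(shared)
print("independent detail flicker:", flicker_independent)
print("shared detail flicker:", flicker_shared)
```

In this toy setting the independently detailed frames differ from one another even though the scene is static, while the consistently detailed frames do not. That is the kind of temporal inconsistency a video model has to suppress without giving up the added detail.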

Topics

Deep Learning > Architectures > Transformers
Deep Learning > Models > Generative Models
Computer Vision > Analysis > Depth Estimation
Computer Vision > Generation > Video Generation
Computer Vision > Processing > Video Processing

Keywords

video generation, image reconstruction, video super-resolution, generative model, temporal consistency, image upsampling, generative upsampling, video upsampling, image upscaler
