Enhancing Video Super-Resolution via Implicit Resampling-based Alignment

Kai Xu; Ziwei Yu; Xin Wang; Michael Bi Mi; Angela Yao

CVPR 2024


Abstract

In video super-resolution, it is common to use frame-wise alignment to support the propagation of information over time. The role of alignment is well studied for low-level enhancement in video, but existing works overlook a critical step: resampling. We show through extensive experiments that, for alignment to be effective, the resampling should preserve the reference frequency spectrum while minimizing spatial distortions. However, most existing works simply default to bilinear interpolation for resampling, even though bilinear interpolation has a smoothing effect and hinders super-resolution. From these observations, we propose an implicit resampling-based alignment. The sampling positions are encoded by a sinusoidal positional encoding, while the value is estimated with a coordinate network and a window-based cross-attention. We show that bilinear interpolation inherently attenuates high-frequency information, while an MLP-based coordinate network can approximate more frequencies. Experiments on synthetic and real-world datasets show that alignment with our proposed implicit resampling enhances the performance of state-of-the-art frameworks with minimal impact on both compute and parameters.
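The smoothing effect of bilinear interpolation mentioned in the abstract can be illustrated in one dimension. The following is a minimal sketch, not the authors' implementation: it computes the magnitude response of linear interpolation at a fractional offset, and shows one possible sinusoidal encoding of a sub-pixel offset. The function names, the number of bands, and the octave-spaced frequencies are assumptions made for illustration only.

```python
import numpy as np


def linear_resample_gain(omega, t):
    """Magnitude response of 1D linear interpolation at fractional offset t.

    Resampling x[n] at position n + t gives (1 - t) * x[n] + t * x[n + 1],
    a two-tap filter with frequency response (1 - t) + t * exp(1j * omega).
    """
    return np.abs((1.0 - t) + t * np.exp(1j * omega))


def sinusoidal_encoding(offset, num_bands=4):
    """Encode a scalar sub-pixel offset with sines and cosines.

    The octave-spaced frequencies (pi, 2*pi, 4*pi, ...) are an assumed
    layout for illustration, not taken from the paper.
    """
    freqs = (2.0 ** np.arange(num_bands)) * np.pi
    return np.concatenate([np.sin(freqs * offset), np.cos(freqs * offset)])


# Gain at DC, a mid frequency, and the Nyquist frequency for a half-pixel shift.
omegas = np.array([0.0, np.pi / 2, np.pi])
gains = linear_resample_gain(omegas, 0.5)

# Encoding of the same half-pixel offset, as a coordinate network input.
encoded = sinusoidal_encoding(0.5)
```

For a half-pixel shift the gain equals |cos(omega / 2)|, so it falls from 1 at DC to 0 at the Nyquist frequency, which is the attenuation of high-frequency content the abstract refers to. At integer offsets (t = 0) the gain is 1 at every frequency, so the loss depends on the sub-pixel position.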


Authors

Kai Xu, Ziwei Yu, Xin Wang, Michael Bi Mi, Angela Yao

Topics

Deep Learning > Architectures
Computer Vision > Generation > Image Generation
Computer Vision > Generation > Video Generation
Computer Vision > Processing > Image Restoration
Computer Vision > Processing > Video Processing
Computer Vision > Processing > Image Processing

Keywords

image restoration, video super-resolution, image alignment, implicit neural representation, implicit neural network, positional encoding, frequency spectrum, frame alignment, neural network, coordinate network, spatial distortion, implicit resampling
