Revealing Disocclusions in Temporal View Synthesis Through Infilling Vector Prediction

Vijayalakshmi Kanchana; Nagabhushan Somraj; Suraj Yadwad; Rajiv Soundararajan

WACV 2022

Abstract

We consider the problem of temporal view synthesis, where the goal is to predict a future video frame from past frames using knowledge of the depth and relative camera motion. In contrast to revealing the disoccluded regions through intensity-based infilling, we study the idea of an infilling vector, which infills a disoccluded pixel by pointing to a non-disoccluded region in the synthesized view. To exploit the structure of disocclusions created by camera motion during their infilling, we rely on two important cues: the temporal correlation of infilling directions, and depth. We design a learning framework that predicts the infilling vector by computing a temporal prior that reflects past infilling directions and supplying it, together with a normalized depth map, as input to the network. We conduct extensive experiments on a large-scale dataset that we build for evaluating temporal view synthesis, in addition to the SceneNet RGB-D dataset. Our experiments demonstrate that our infilling vector prediction approach achieves superior quantitative and qualitative infilling performance compared to other approaches in the literature. Our dataset and code can be found at https://nagabhushansn95.github.io/publications/2021/ivp.html
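The infilling-vector idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function name `infill_with_vectors`, the grayscale list-of-lists image layout, the nearest-neighbour rounding, and the hand-written vectors (standing in for the network's predictions) are all assumptions made for illustration. Each disoccluded pixel copies the intensity of the known pixel that its vector points to.

```python
from typing import List, Tuple

Image = List[List[float]]               # grayscale image, row-major (assumed layout)
Mask = List[List[bool]]                 # True where the pixel is disoccluded
VectorField = List[List[Tuple[float, float]]]  # (dy, dx) offset per pixel


def infill_with_vectors(warped: Image, disoccluded: Mask,
                        vectors: VectorField) -> Image:
    """Fill disoccluded pixels by copying from the pixel each vector points to.

    Hypothetical helper: uses nearest-neighbour sampling and clamps the
    source location to the image bounds.
    """
    height, width = len(warped), len(warped[0])
    result = [row[:] for row in warped]  # leave the input untouched
    for y in range(height):
        for x in range(width):
            if not disoccluded[y][x]:
                continue  # known pixels are kept as they are
            dy, dx = vectors[y][x]
            # Source location the infilling vector points to, clamped to bounds.
            sy = min(max(y + round(dy), 0), height - 1)
            sx = min(max(x + round(dx), 0), width - 1)
            # Copy only from a known (non-disoccluded) source pixel.
            if not disoccluded[sy][sx]:
                result[y][x] = warped[sy][sx]
    return result


# Toy 2x4 warped frame with one disoccluded column (index 2).
warped = [[1.0, 2.0, 0.0, 4.0],
          [5.0, 6.0, 0.0, 8.0]]
mask = [[False, False, True, False],
        [False, False, True, False]]
zero = (0.0, 0.0)
vectors = [[zero, zero, (0.0, 1.0), zero],    # top hole points right
           [zero, zero, (0.0, -1.0), zero]]   # bottom hole points left

filled = infill_with_vectors(warped, mask, vectors)
```

In this toy case the top hole takes its value from the right neighbour and the bottom hole from the left neighbour. In the paper's setting the vectors would come from the learned network, which takes the temporal prior and normalized depth as input.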

Topics

Computer Vision > Analysis > 3D Vision; Computer Vision > Generation > Video Generation; Deep Learning > Learning Types > Deep Learning

Keywords

video frame prediction; depth estimation; camera motion; temporal view synthesis; disocclusion handling; image infilling
