Generative Rendering: Controllable 4D-Guided Video Generation with 2D Diffusion Models

Shengqu Cai; Duygu Ceylan; Matheus Gadelha; Chun-Hao Paul Huang; Tuanfeng Yang Wang; Gordon Wetzstein

CVPR 2024

Abstract

Traditional 3D content creation tools empower users to bring their imagination to life by giving them direct control over a scene's geometry, appearance, motion, and camera path. Creating computer-generated videos, however, is a tedious manual process, which can be automated by emerging text-to-video diffusion models. Despite great promise, video diffusion models are difficult to control, hindering users from applying their creativity rather than amplifying it. To address this challenge, we present a novel approach that combines the controllability of dynamic 3D meshes with the expressivity and editability of emerging diffusion models. For this purpose, our approach takes an animated, low-fidelity rendered mesh as input and injects the ground truth correspondence information obtained from the dynamic mesh into various stages of a pre-trained text-to-image generation model to output high-quality and temporally consistent frames. We demonstrate our approach on various examples where motion can be obtained by animating rigged assets or changing the camera path.

Topics

Deep Learning > Models > Diffusion Models
Computer Vision > Generation > Video Generation
Computer Science > Applications > Computer Graphics

Keywords

video generation, diffusion model, controllable generation, 3D mesh, temporal consistency, 4D-guided generation
