3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation

Songchun Zhang; Yibo Zhang; Quan Zheng; Rui Ma; Wei Hua; Hujun Bao; Weiwei Xu; Changqing Zou

CVPR 2024

Abstract

Text-driven 3D scene generation techniques have made rapid progress in recent years. Their success is mainly attributed to using existing generative models to iteratively perform image warping and inpainting to generate 3D scenes. However, these methods heavily rely on the outputs of existing models, leading to error accumulation in geometry and appearance that prevents the models from being used in various scenarios (e.g., outdoor and unreal scenarios). To address this limitation, we generatively refine the newly generated local views by querying and aggregating global 3D information, and then progressively generate the 3D scene. Specifically, we employ a tri-plane feature-based NeRF as a unified representation of the 3D scene to constrain global 3D consistency, and propose a generative refinement network to synthesize new content with higher quality by exploiting the natural image prior from a 2D diffusion model as well as the global 3D information of the current scene. Our extensive experiments demonstrate that, in comparison to previous methods, our approach supports a wide variety of scene generation and arbitrary camera trajectories with improved visual quality and 3D consistency.
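To make the tri-plane representation mentioned in the abstract concrete, the following is a minimal sketch of how a tri-plane feature query is commonly done. It is not the authors' implementation. The function names (`bilinear_sample`, `query_triplane`), the nested-list plane layout, the [-1, 1] coordinate range, and the choice to aggregate the three plane features by summation are all assumptions made for illustration; the abstract does not specify any of them.

```python
from typing import List, Sequence

# Hypothetical layout: a feature plane stored as [rows][cols][channels].
Plane = List[List[List[float]]]


def bilinear_sample(plane: Plane, u: float, v: float) -> List[float]:
    """Bilinearly interpolate a feature plane at (u, v) in [-1, 1]^2.

    u indexes columns and v indexes rows; -1 and 1 map to the first
    and last texel along each axis.
    """
    rows, cols = len(plane), len(plane[0])
    # Map normalized coordinates to continuous texel indices, clamped to the grid.
    x = min(max((u + 1.0) * 0.5 * (cols - 1), 0.0), cols - 1.0)
    y = min(max((v + 1.0) * 0.5 * (rows - 1), 0.0), rows - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, cols - 1), min(y0 + 1, rows - 1)
    tx, ty = x - x0, y - y0
    channels = len(plane[0][0])
    return [
        (1 - ty) * ((1 - tx) * plane[y0][x0][c] + tx * plane[y0][x1][c])
        + ty * ((1 - tx) * plane[y1][x0][c] + tx * plane[y1][x1][c])
        for c in range(channels)
    ]


def query_triplane(
    plane_xy: Plane, plane_xz: Plane, plane_yz: Plane, point: Sequence[float]
) -> List[float]:
    """Project a 3D point onto the three axis-aligned planes and combine features.

    Summation is an assumed aggregation; concatenation is another common choice.
    """
    x, y, z = point
    f_xy = bilinear_sample(plane_xy, x, y)
    f_xz = bilinear_sample(plane_xz, x, z)
    f_yz = bilinear_sample(plane_yz, y, z)
    return [a + b + c for a, b, c in zip(f_xy, f_xz, f_yz)]
```

In a tri-plane NeRF, a feature vector obtained this way is typically passed to a small decoder network that predicts density and color for volume rendering. Because every view reads from the same three planes, the representation can act as a shared, global description of the scene.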

Topics

Deep Learning > Models > Generative Models
Computer Vision > Generation > 3D Generation
Computer Vision > Generation > 3D Vision

Keywords

diffusion model; neural radiance field; 3D scene generation; 3D consistency; text-driven generation; generative refinement; text-driven 3D generation
