Make-It-3D: High-fidelity 3D Creation from A Single Image with Diffusion Prior

Junshu Tang; Tengfei Wang; Bo Zhang; Ting Zhang; Ran Yi; Lizhuang Ma; Dong Chen

ICCV 2023

Abstract

In this work, we investigate the problem of creating high-fidelity 3D content from only a single image. This is inherently challenging: it requires estimating the underlying 3D geometry while simultaneously hallucinating unseen textures. To address this challenge, we leverage prior knowledge from a well-trained 2D diffusion model as 3D-aware supervision for 3D creation. Our proposed method, Make-It-3D, employs a two-stage optimization pipeline: the first stage optimizes a neural radiance field with constraints from the reference image and the diffusion prior; the second stage builds textured point clouds from the coarse model and further enhances the textures with the diffusion prior, leveraging the high-quality textures available in the reference image. Extensive experiments show that our method achieves a clear improvement over previous works, displaying faithful reconstruction and impressive visual quality. Our method presents the first attempt to achieve high-quality 3D creation from a single image for general objects, and enables various applications such as text-to-3D creation and texture editing.
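To make the second stage more concrete, the step of building a textured point cloud from a coarse model can be illustrated by lifting the reference image into 3D using a per-pixel depth map. The sketch below is only an illustration under stated assumptions, not the authors' implementation: it assumes a pinhole camera with the principal point at the image center, a depth map already rendered from the coarse model at the reference view, and a hypothetical helper name `unproject_to_point_cloud`.

```python
import numpy as np

def unproject_to_point_cloud(depth, image, focal, mask=None):
    """Lift an RGB image with per-pixel depth into a colored point cloud.

    Assumptions (not taken from the paper): pinhole camera, principal
    point at the image center, depth measured along the camera z-axis.
    """
    h, w = depth.shape
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    # Pixel grid: u indexes columns (x), v indexes rows (y).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Back-project each pixel along its camera ray to the given depth.
    x = (u - cx) / focal * depth
    y = (v - cy) / focal * depth
    points = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    colors = image.reshape(-1, 3)
    # By default, keep only pixels with positive depth (foreground).
    if mask is None:
        mask = depth > 0
    keep = mask.reshape(-1)
    return points[keep], colors[keep]

# Toy usage: a flat 4x4 surface at depth 2 with one background pixel.
depth = np.full((4, 4), 2.0)
depth[0, 0] = 0.0  # treated as background and dropped
image = np.random.default_rng(0).random((4, 4, 3))
points, colors = unproject_to_point_cloud(depth, image, focal=4.0)
```

Each retained point carries the color of the reference pixel it came from, which is one simple way the high-quality textures of the reference image could be transferred to the visible part of the 3D model; textures for regions unseen in the reference view would still need to come from the diffusion prior.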

Topics

Deep Learning > Models > Diffusion Models
Computer Vision > Analysis > 3D Vision

Keywords

3D reconstruction, diffusion model, neural radiance field
