G3DR: Generative 3D Reconstruction in ImageNet

Pradyumna Reddy; Ismail Elezi; Jiankang Deng

2024 CVPR CVPR 2024

G3DR: Generative 3D Reconstruction in ImageNet

Abstract

We introduce a novel 3D generative method Generative 3D Reconstruction (G3DR) in ImageNet capable of generating diverse and high-quality 3D objects from single images addressing the limitations of existing methods. At the heart of our framework is a novel depth regularization technique that enables the generation of scenes with high-geometric fidelity. G3DR also leverages a pretrained language-vision model such as CLIP to enable reconstruction in novel views and improve the visual realism of generations. Additionally G3DR designs a simple but effective sampling procedure to further improve the quality of generations. G3DR offers diverse and efficient 3D asset generation based on class or text conditioning. Despite its simplicity G3DR is able to beat state-of-theart methods improving over them by up to 22% in perceptual metrics and 90% in geometry scores while needing only half of the training time. Code is available at https://github.com/preddy5/G3DR

🌉 Interdisciplinary Bridge — Computer Vision and Deep Learning

🧭 Keyword Pioneer — depth regularization

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Pradyumna Reddy , Ismail Elezi , Jiankang Deng

Topics

Computer Vision > Analysis > 3D Vision Computer Vision > Generation > Image Generation Deep Learning > Learning Types > Generative Models Computer Vision > Generation > 3D Generation Computer Vision > Processing > 3D Vision

Keywords

3d reconstruction image generation generative model novel view synthesis perceptual metric depth regularization

Download PDF

Related papers

DUSt3R: Geometric 3D Vision Made Easy 2024

Bezier Everywhere All at Once: Learning Drivable Lanes as Bezier Graphs 2024

NeRFDeformer: NeRF Transformation from a Single View via 3D Scene Flows 2024

Unleashing Unlabeled Data: A Paradigm for Cross-View Geo-Localization 2024

DIMAT: Decentralized Iterative Merging-And-Training for Deep Learning Models 2024