Diffusion Time-step Curriculum for One Image to 3D Generation

Xuanyu Yi; Zike Wu; Qingshan Xu; Pan Zhou; Joo-Hwee Lim; Hanwang Zhang

CVPR 2024

Abstract

Score distillation sampling (SDS) has been widely adopted to overcome the absence of unseen views in reconstructing 3D objects from a single image. It leverages pre-trained 2D diffusion models as a teacher to guide the reconstruction of student 3D models. Despite their remarkable success, SDS-based methods often encounter geometric artifacts and texture saturation. We find that the crux is the overlooked, indiscriminate treatment of diffusion time-steps during optimization: it unreasonably treats the student-teacher knowledge distillation as equal at all time-steps and thus entangles coarse-grained and fine-grained modeling. Therefore, we propose the Diffusion Time-step Curriculum one-image-to-3D pipeline (DTC123), in which both the teacher and student models collaborate with the time-step curriculum in a coarse-to-fine manner. Extensive experiments on the NeRF4, RealFusion15, GSO, and Level50 benchmarks demonstrate that DTC123 can produce multi-view consistent, high-quality, and diverse 3D assets. Code and more generation demos will be released at https://github.com/yxymessi/DTC123.
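To make the coarse-to-fine idea concrete, the following is a minimal sketch of one possible time-step curriculum: early optimization steps draw large diffusion time-steps (heavy noise, coarse structure) and later steps draw small ones (light noise, fine detail). It is not taken from the DTC123 code. The function name `curriculum_timestep`, the linear decay, the bounds `t_max` and `t_min`, and the `jitter` width are all assumptions chosen for illustration.

```python
import random


def curriculum_timestep(step, total_steps, t_max=980, t_min=20, jitter=50, rng=random):
    """Pick a diffusion time-step that shrinks as optimization progresses.

    Hypothetical helper: a linear coarse-to-fine schedule with uniform jitter.
    The paper's actual schedule may differ.
    """
    # Fraction of the optimization completed, clamped to [0, 1].
    if total_steps <= 1:
        progress = 1.0
    else:
        progress = min(max(step / (total_steps - 1), 0.0), 1.0)

    # Linearly move the sampling centre from t_max (coarse) to t_min (fine).
    center = t_max - progress * (t_max - t_min)

    # Sample around the centre, then clamp back into the valid range.
    t = center + rng.uniform(-jitter, jitter)
    return int(round(min(max(t, t_min), t_max)))


if __name__ == "__main__":
    total = 1000
    for step in (0, 250, 500, 750, 999):
        print(step, curriculum_timestep(step, total))
```

By contrast, a plain SDS loop would typically draw the time-step uniformly from the whole range at every iteration, which is the indiscriminate treatment the abstract argues against.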

Topics

Deep Learning > Models > Diffusion Models
Computer Vision > Analysis > 3D Vision
Computer Vision > Generation > Image Generation
Computer Vision > Processing > Image Restoration
Computer Vision > Generation > 3D Generation

Keywords

diffusion model, 3D generation, neural radiance field, score distillation sampling, multi-view consistent, time-step curriculum
