LT3SD: Latent Trees for 3D Scene Diffusion

Quan Meng; Lei Li; Matthias Nießner; Angela Dai

2025 CVPR CVPR 2025

LT3SD: Latent Trees for 3D Scene Diffusion

Abstract

We present LT3SD, a novel latent diffusion model for large-scale 3D scene generation. Recent advances in diffusion models have shown impressive results in 3D object generation, but are limited in spatial extent and quality when extended to 3D scenes. To generate complex and diverse 3D scene structures, we introduce a latent tree representation to effectively encode both lower-frequency geometry and higher-frequency detail in a coarse-to-fine hierarchy. We can then learn a generative diffusion process in this latent 3D scene space, modeling the latent components of a scene at each resolution level. To synthesize large-scale scenes with varying sizes, we train our diffusion model on scene patches and synthesize arbitrary-sized output 3D scenes through shared diffusion generation across multiple scene patches. Through extensive experiments, we demonstrate the efficacy and benefits of LT3SD for large-scale, high-quality unconditional 3D scene generation and for probabilistic completion for partial scene observations.

🌉 Interdisciplinary Bridge — Computer Science and Computer Vision and Deep Learning

🧭 Keyword Pioneer — generative diffusion process

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Quan Meng , Lei Li , Matthias Nießner , Angela Dai

Topics

Deep Learning > Models > Diffusion Models Computer Vision > Analysis > 3D Vision Computer Vision > Generation > Image Generation Computer Science > Applications > Computer Graphics Computer Vision > Generation > 3D Generation

Keywords

3d reconstruction scene understanding hierarchical representation generative model diffusion model latent diffusion latent diffusion model 3d scene generation scene synthesis scene completion generative diffusion generative diffusion process

Download PDF

Related papers

AnyCam: Learning to Recover Camera Poses and Intrinsics from Casual Videos 2025

SeriesBench: A Benchmark for Narrative-Driven Drama Series Understanding 2025

FADE: Frequency-Aware Diffusion Model Factorization for Video Editing 2025

Fast and Accurate Gigapixel Pathological Image Classification with Hierarchical Distillation Multi-Instance Learning 2025

Reversible Decoupling Network for Single Image Reflection Removal 2025