Towards Realistic Scene Generation with LiDAR Diffusion Models

Haoxi Ran; Vitor Guizilini; Yue Wang

2024 CVPR CVPR 2024

Towards Realistic Scene Generation with LiDAR Diffusion Models

Abstract

Diffusion models (DMs) excel in photo-realistic image synthesis but their adaptation to LiDAR scene generation poses a substantial hurdle. This is primarily because DMs operating in the point space struggle to preserve the curve-like patterns and 3D geometry of LiDAR scenes which consumes much of their representation power. In this paper we propose LiDAR Diffusion Models (LiDMs) to generate LiDAR-realistic scenes from a latent space tailored to capture the realism of LiDAR scenes by incorporating geometric priors into the learning pipeline. Our method targets three major desiderata: pattern realism geometry realism and object realism. Specifically we introduce curve-wise compression to simulate real-world LiDAR patterns point-wise coordinate supervision to learn scene geometry and patch-wise encoding for a full 3D object context. With these three core designs our method achieves competitive performance on unconditional LiDAR generation in 64-beam scenario and state of the art on conditional LiDAR generation while maintaining high efficiency compared to point-based DMs (up to 107xfaster). Furthermore by compressing LiDAR scenes into a latent space we enable the controllability of DMs with various conditions such as semantic maps camera views and text prompts. Our code and pretrained weights are available at https://github.com/hancyran/LiDAR-Diffusion.

🌉 Interdisciplinary Bridge — Computer Vision and Deep Learning and Machine Learning

🧭 Keyword Pioneer — lidar scene generation

🐣 Hot Topic Early Bird — scene generation

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Haoxi Ran , Vitor Guizilini , Yue Wang

Topics

Machine Learning > Application Areas > Domain Adaptation Deep Learning > Models > Diffusion Models Computer Vision > Generation > Image Generation Computer Vision > Domain-Specific > Autonomous Driving Computer Vision > Generation > 3D Generation

Keywords

point cloud generation point cloud conditional generation geometric prior diffusion model latent space scene generation 3d geometry lidar processing lidar scene generation point cloud synthesis

Download PDF

Related papers

DUSt3R: Geometric 3D Vision Made Easy 2024

Bezier Everywhere All at Once: Learning Drivable Lanes as Bezier Graphs 2024

NeRFDeformer: NeRF Transformation from a Single View via 3D Scene Flows 2024

Unleashing Unlabeled Data: A Paradigm for Cross-View Geo-Localization 2024

DIMAT: Decentralized Iterative Merging-And-Training for Deep Learning Models 2024