MegaSynth: Scaling Up 3D Scene Reconstruction with Synthesized Data

Hanwen Jiang; Zexiang Xu; Desai Xie; Ziwen Chen; Haian Jin; Fujun Luan; Zhixin Shu; Kai Zhang; Sai Bi; Xin Sun; Jiuxiang Gu; Qixing Huang; Georgios Pavlakos; Hao Tan

CVPR 2025

Abstract

We propose scaling up 3D scene reconstruction by training with synthesized data. At the core of our work is MegaSynth, a procedurally generated 3D dataset comprising 700K scenes, over 50 times larger than the prior real dataset DL3DV, which dramatically scales up the training data. To enable scalable data generation, our key idea is to eliminate semantic information, removing the need to model complex semantic priors such as object affordances and scene composition. Instead, we model scenes with basic spatial structures and geometry primitives, which makes generation scalable. In addition, we control data complexity to facilitate training while loosely aligning it with the real-world data distribution to benefit real-world generalization. We explore training large reconstruction models (LRMs) with both MegaSynth and available real data. Experimental results show that joint training or pre-training with MegaSynth improves reconstruction quality by 1.2 to 1.8 dB PSNR across diverse image domains. Moreover, models trained solely on MegaSynth perform comparably to those trained on real data, underscoring the low-level nature of 3D reconstruction. Additionally, we provide an in-depth analysis of MegaSynth's properties for enhancing model capability, training stability, and generalization.
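To put the reported 1.2 to 1.8 dB PSNR gain in perspective, the sketch below uses the standard PSNR definition, 10 * log10(MAX^2 / MSE), to convert a dB gain into the corresponding change in mean squared error. It is an illustration only, not the paper's evaluation code; the function names `psnr` and `mse_ratio_for_gain` and the assumed pixel range of [0, 1] are hypothetical choices made here.

```python
import math


def psnr(mse: float, max_val: float = 1.0) -> float:
    """PSNR in dB for a given mean squared error (pixel range assumed [0, max_val])."""
    if mse <= 0:
        return math.inf  # identical images: error-free reconstruction
    return 10.0 * math.log10(max_val ** 2 / mse)


def mse_ratio_for_gain(gain_db: float) -> float:
    """Factor by which MSE is multiplied when PSNR rises by gain_db."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    # The two endpoints of the gain range quoted in the abstract.
    for gain in (1.2, 1.8):
        ratio = mse_ratio_for_gain(gain)
        print(f"+{gain} dB PSNR -> MSE multiplied by {ratio:.3f}")
```

Under this definition, a gain of 1.2 to 1.8 dB corresponds to a reduction in mean squared error of roughly 24% to 34%.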

Topics

Deep Learning > Models > Generative Models
Computer Vision > Analysis > 3D Vision
Computer Vision > Processing > Image Restoration
Machine Learning > Learning Paradigms > Transfer Learning
Deep Learning > Learning Types > Deep Learning
Computer Vision > Core AI > Computer Vision

Keywords

3D reconstruction; scene reconstruction; neural rendering; procedural generation; novel view synthesis; large reconstruction model; 3D scene reconstruction; synthesized data; latent reconstruction model
