2025 ICCV ICCV 2025

SViM3D: Stable Video Material Diffusion for Single Image 3D Generation

Abstract

We present Stable Video Materials 3D (SViM3D), a framework to predict multi-view consistent physically based rendering (PBR) materials, given a single image. Recently, video diffusion models have been successfully used to reconstruct 3D objects from a single image efficiently. However, reflectance is still represented by simple material models or needs to be estimated in additional pipeline steps to enable relighting and controlled appearance edits. We extend a latent video diffusion model to output spatially-varying PBR parameters and surface normals jointly with each generated RGB view based on explicit camera control. This unique setup allows for direct relighting in a 2.5D setting, and for generating a 3D asset using our model as neural prior. We introduce various mechanisms to this pipeline that improve quality in this ill-posed setting. We show state-of-the-art relighting and novel view synthesis performance on multiple object-centric datasets. Our method generalizes to diverse image inputs, enabling the generation of relightable 3D assets useful in AR/VR, movies, games and other visual media.

🌉 Interdisciplinary Bridge — Computer Vision and Deep Learning
🧭 Keyword Pioneer — relightable 3d
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio