Diffusion Renderer: Neural Inverse and Forward Rendering with Video Diffusion Models

Ruofan Liang; Zan Gojcic; huan ling; Jacob Munkberg; Jon Hasselgren; Chih-Hao Lin; Jun Gao; Alexander Keller; Nandita Vijaykumar; Sanja Fidler; Zian Wang

2025 CVPR CVPR 2025

Diffusion Renderer: Neural Inverse and Forward Rendering with Video Diffusion Models

Abstract

Understanding and modeling lighting effects are fundamental tasks in computer vision and graphics. Classic physically-based rendering (PBR) accurately simulates the light transport, but relies on precise scene representations--explicit 3D geometry, high-quality material properties, and lighting conditions--that are often impractical to obtain in real-world scenarios. Therefore, we introduce Diffusion Renderer, a neural approach that addresses the dual problem of inverse and forward rendering within a holistic framework. Leveraging powerful video diffusion model priors, the inverse rendering model accurately estimates G-buffers from real-world videos, providing an interface for image editing tasks, and training data for the rendering model. Conversely, our rendering model generates photorealistic images from G-buffers without explicit light transport simulation. Specifically, we first train a video diffusion model for inverse rendering on synthetic data, which generalizes well to real-world videos and allows us to auto-label diverse real-world videos. We then co-train our rendering model using both synthetic and auto-labeled real-world data. Experiments demonstrate that Diffusion Renderer effectively approximates inverse and forwards rendering, consistently outperforming the state-of-the-art. Our model enables practical applications from a single video input--including relighting, material editing, and realistic object insertion.

🌉 Interdisciplinary Bridge — Computer Vision and Deep Learning

🧭 Keyword Pioneer — forward rendering

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Interdisciplinary, Machine Learning, Mathematics & Optimization, Natural Language Processing, Robotics, Speech & Audio

Authors

Ruofan Liang , Zan Gojcic , huan ling , Jacob Munkberg , Jon Hasselgren , Chih-Hao Lin , Jun Gao , Alexander Keller , Nandita Vijaykumar , Sanja Fidler , Zian Wang

Topics

Deep Learning > Models > Diffusion Models Computer Vision > Analysis > 3D Vision Computer Vision > Processing > Image Processing Computer Vision > Generation > 3D Generation

Keywords

neural rendering video diffusion model inverse rendering forward rendering g-buffer estimation

Download PDF

Related papers

AnyCam: Learning to Recover Camera Poses and Intrinsics from Casual Videos 2025

SeriesBench: A Benchmark for Narrative-Driven Drama Series Understanding 2025

FADE: Frequency-Aware Diffusion Model Factorization for Video Editing 2025

Fast and Accurate Gigapixel Pathological Image Classification with Hierarchical Distillation Multi-Instance Learning 2025

Reversible Decoupling Network for Single Image Reflection Removal 2025