WACV 2026

SAIL: Self-supervised Learning of Lighting-Invariant Representations from Real Images with Latent Diffusion

Abstract

Intrinsic image decomposition aims to separate an image into its underlying albedo and shading components, isolating the base color from lighting effects to enable downstream applications such as virtual relighting and scene editing. Despite the success of learning-based approaches, intrinsic image decomposition from real-world images remains challenging because labeled ground-truth data is scarce. Most existing solutions rely on synthetic data in supervised setups, which limits their ability to generalize to real-world scenes. Self-supervised methods, on the other hand, often produce albedo-like maps that contain reflections and are inconsistent across lighting conditions. To address this, we propose SAIL, an approach that estimates illumination-invariant representations from single-view real-world images, specifically targeting plausible relighting. We repurpose the prior knowledge of a latent diffusion model for unconditioned scene relighting as a surrogate objective for learning light-invariant estimates. To achieve this, we introduce a novel intrinsic image decomposition formulated entirely in the latent space. To guide the training of our latent diffusion model, we introduce regularization terms that constrain both the lighting-dependent and lighting-independent components of our latent image decomposition. Our experiments demonstrate that SAIL produces albedo-like representations that remain stable under varying lighting conditions and generalizes to multiple scenes, using only unlabeled multi-illumination data available online.
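
For readers unfamiliar with the task, the sketch below illustrates the classical pixel-space intrinsic model, in which an image is the element-wise product of albedo and shading. It is not the latent-space formulation proposed in SAIL, which the abstract does not specify. The function names `compose` and `relight`, the array shapes, and the random test data are all illustrative assumptions.

```python
import numpy as np

def compose(albedo, shading):
    """Classical intrinsic model (assumed here): image = albedo * shading."""
    return albedo * shading

def relight(image, shading_old, shading_new, eps=1e-6):
    """Divide out the old shading to recover albedo, then apply new shading."""
    albedo = image / np.maximum(shading_old, eps)
    return albedo * shading_new

# Hypothetical toy scene: one albedo map observed under two lighting conditions.
rng = np.random.default_rng(0)
albedo = rng.uniform(0.2, 0.9, size=(4, 4, 3))     # RGB base color
shading_a = rng.uniform(0.3, 1.0, size=(4, 4, 1))  # grayscale shading, light A
shading_b = rng.uniform(0.3, 1.0, size=(4, 4, 1))  # grayscale shading, light B

image_a = compose(albedo, shading_a)
image_b = compose(albedo, shading_b)

# Relighting image A with shading B should reproduce image B,
# because the albedo is shared across both lighting conditions.
relit = relight(image_a, shading_a, shading_b)
```

In this toy setting the albedo recovered from either image is identical, which is the lighting-invariance property the abstract refers to. With real images the shading is unknown, which is what makes the problem hard.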
