WACV 2025

ColorizeDiffusion: Improving Reference-Based Sketch Colorization with Latent Diffusion Model

Abstract

Diffusion models have achieved great success in dual-conditioned image generation. However, they still face significant challenges in image-guided sketch colorization, where reference and sketch images usually exhibit different spatial structures and semantics. This mismatch, termed "distribution shift" in this paper, results in various artifacts and degrades colorization quality. To address this issue, we conducted thorough investigations into the image-prompted latent diffusion model and, based on our analysis, developed a two-stage training framework to mitigate the effects of distribution shift. Comprehensive quantitative comparisons, qualitative evaluations, and user studies were performed to demonstrate the superiority of our proposed methods. Additionally, an ablation study was conducted to assess the impact of the distribution shift and the selection of reference embeddings. Code is publicly available at https://github.com/tellurionkanata/colorizeDiffusion.
