2021 ICCV ICCV 2021

Harnessing the Conditioning Sensorium for Improved Image Translation

Abstract

Existing methods for multi-modal domain translation learn to embed the input images into a domain-invariant "content" space and a domain-specific "style" space from which novel images can be synthesized. Rather than learning to embed the RGB image from scratch we propose deriving our content representation from conditioning data produced by pretrained off-the-shelf networks. Motivated by the inherent ambiguity of "content", which has different meanings depending on the desired level of abstraction, this approach gives intuitive control over which aspects of content are preserved across domains. We evaluate our method on traditional, well-aligned, datasets such as CelebA-HQ, and propose two novel datasets for evaluation on more complex scenes: ClassicTV and FFHQ-WildCrops. Our approach, which we call Sensorium, enables higher quality domain translation for complex scenes than prior work.

🧭 Keyword Pioneer — style space
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio