WACV 2026

SphereEdit: Spherical Semantic Editing in Diffusion Models

Abstract

Despite significant advances in diffusion models, achieving precise and composable image editing without task-specific training remains a challenge. Existing approaches often rely on iterative optimization or linear latent operations, which are slow, brittle, and prone to attribute entanglement (e.g., editing "lipstick" inadvertently alters skin tone). We introduce SphereEdit, a training-free framework that leverages the spherical geometry of diffusion embeddings and token-aware cross-attention to enable interpretable, fine-grained control. We represent semantic attributes as unit-vector directions in the denoiser's prediction space and show that antipodal symmetry ("old" is approximately the negation of "young") naturally supports bidirectional edits, while approximate orthogonality enables clean composition through spherical coefficients. At inference, these directions modulate cross-attention activations, producing spatially localized edits without optimization or fine-tuning. SphereEdit achieves sharper, more disentangled edits than prior baselines, while remaining plug-and-play and applicable across diverse image editing tasks.
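
The geometric ideas in the abstract (unit directions, antipodal symmetry, composition of near-orthogonal directions) can be illustrated on plain vectors. The following is a toy sketch only, not the SphereEdit implementation: the helper names `unit`, `attribute_direction`, `spherical_edit`, and `compose_edits`, the random stand-in embeddings, and the rotation angles are all assumptions made for illustration. It does not model the denoiser's prediction space or cross-attention modulation.

```python
import numpy as np

def unit(v):
    """Project a vector onto the unit sphere."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

def attribute_direction(pos, neg):
    """Hypothetical helper: unit direction pointing from `neg` toward `pos`."""
    return unit(unit(pos) - unit(neg))

def spherical_edit(x, d, theta):
    """Rotate unit vector x by angle theta (radians) along the great circle toward d.

    A negative theta moves x away from d, i.e. toward the antipodal direction -d.
    """
    x, d = unit(x), unit(d)
    tangent = d - np.dot(d, x) * x          # part of d orthogonal to x
    norm = np.linalg.norm(tangent)
    if norm < 1e-12:                        # d is (anti)parallel to x: no unique rotation
        return x
    return np.cos(theta) * x + np.sin(theta) * (tangent / norm)

def compose_edits(x, directions, angles):
    """Apply several edits in sequence; order matters little when directions are near-orthogonal."""
    for d, theta in zip(directions, angles):
        x = spherical_edit(x, d, theta)
    return x

# Stand-in embeddings (random vectors, assumed purely for illustration).
rng = np.random.default_rng(0)
dim = 64
x = unit(rng.normal(size=dim))
young, old = rng.normal(size=dim), rng.normal(size=dim)

d_old = attribute_direction(old, young)
d_young = attribute_direction(young, old)   # exactly -d_old in this toy construction

edited = spherical_edit(x, d_old, 0.3)      # push toward "old"
reverted = spherical_edit(x, d_old, -0.3)   # push toward "young"
```

In this construction the antipodal relation holds exactly by definition, whereas the abstract reports it only as an approximate empirical property of the learned directions. Because each edit is a rotation, the result stays on the unit sphere, which a linear update `x + alpha * d` does not guarantee.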
