WACV 2026

GroupPortrait: Multi-ID Portrait Generation with High Identity Preservation and Fine-Grained Control

Abstract

Identity-preserving portrait generation has advanced considerably with the development of diffusion models. However, multi-ID generation remains challenging because identity fidelity degrades and control over layout, pose, and expression is insufficient. To address these challenges, we propose GroupPortrait, a novel approach for multi-ID portrait generation with three key innovations: (1) LatentID for high-fidelity identity preservation, (2) a Facial Controller that enables layout guidance and fine-grained facial control, and (3) a Mask-Attention Controller that allocates identity embeddings to specific facial regions. First, the LatentID module improves identity preservation by adding a LatentID loss during training: it maps latent representations to identity features and applies an ID consistency loss as feedback to improve identity retention. Because the LatentID loss is computed in latent space, it is more efficient in time and GPU usage than methods that compute the ID loss in pixel space. Second, to enhance layout and facial controllability, the Facial Controller uses 3D Morphable Models (3DMM) to obtain the facial shape, pose, and expression of each individual, imposing strong spatial conditions during the diffusion process. Finally, we propose a novel Mask-Attention Controller for multi-ID generation, which distributes ID embeddings to their target facial regions by aligning the cross-attention map of LatentID with the given facial region masks. Extensive experiments demonstrate that GroupPortrait generates group portraits with high fidelity, local harmony, and controllability.
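
The abstract does not give the exact form of the ID consistency loss or of the attention-mask alignment, so the following is only a minimal NumPy sketch of one plausible reading, not the authors' implementation. It assumes the ID consistency loss is a cosine distance between identity features predicted from latents and reference identity features, and that alignment penalizes the share of each identity's cross-attention mass falling outside its facial region mask. The function names `id_consistency_loss` and `mask_attention_loss` and all array shapes are hypothetical.

```python
import numpy as np


def id_consistency_loss(pred_id, ref_id, eps=1e-8):
    """Mean cosine distance between two batches of identity features.

    pred_id: (B, D) identity features mapped from latents (assumed shape).
    ref_id:  (B, D) reference identity features (assumed shape).
    """
    pred_id = np.asarray(pred_id, dtype=np.float64)
    ref_id = np.asarray(ref_id, dtype=np.float64)
    # L2-normalize each feature vector before taking the dot product.
    pred = pred_id / (np.linalg.norm(pred_id, axis=-1, keepdims=True) + eps)
    ref = ref_id / (np.linalg.norm(ref_id, axis=-1, keepdims=True) + eps)
    return float(np.mean(1.0 - np.sum(pred * ref, axis=-1)))


def mask_attention_loss(attn, masks, eps=1e-8):
    """Mean fraction of attention mass lying outside each identity's mask.

    attn:  (K, H, W) non-negative cross-attention maps, one per identity.
    masks: (K, H, W) binary facial region masks, one per identity.
    This penalty is an assumed stand-in for the alignment in the abstract.
    """
    attn = np.asarray(attn, dtype=np.float64)
    masks = np.asarray(masks, dtype=np.float64)
    total = attn.sum(axis=(1, 2)) + eps           # attention mass per identity
    inside = (attn * masks).sum(axis=(1, 2))      # mass inside the face mask
    return float(np.mean(1.0 - inside / total))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(2, 16))
    print("ID loss, identical features:", id_consistency_loss(feats, feats))

    # Two identities on a 4x4 grid: left half and right half of the image.
    masks = np.zeros((2, 4, 4))
    masks[0, :, :2] = 1.0
    masks[1, :, 2:] = 1.0
    uniform_attn = np.ones((2, 4, 4))
    print("Mask loss, uniform attention:", mask_attention_loss(uniform_attn, masks))
    print("Mask loss, aligned attention:", mask_attention_loss(masks, masks))
```

Both quantities are cheap reductions over low-resolution tensors, which is consistent with the abstract's point that working in latent space avoids decoding to pixels. In a training setup they would be written with differentiable tensor operations rather than NumPy.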
