PortraitBooth: A Versatile Portrait Model for Fast Identity-preserved Personalization

Xu Peng; Junwei Zhu; Boyuan Jiang; Ying Tai; Donghao Luo; Jiangning Zhang; Wei LIN; Taisong Jin; Chengjie Wang; Rongrong Ji

CVPR 2024

Abstract

Recent advancements in personalized image generation using diffusion models have been noteworthy. However, existing methods suffer from inefficiencies due to the requirement for subject-specific fine-tuning. This computationally intensive process hinders efficient deployment, limiting practical usability. Moreover, these methods often grapple with identity distortion and limited expression diversity. In light of these challenges, we propose PortraitBooth, an innovative approach designed for high efficiency, robust identity preservation, and expression-editable text-to-image generation, without the need for fine-tuning. PortraitBooth leverages subject embeddings from a face recognition model for personalized image generation without fine-tuning. It eliminates computational overhead and mitigates identity distortion. The introduced dynamic identity preservation strategy further ensures close resemblance to the original image identity. Moreover, PortraitBooth incorporates emotion-aware cross-attention control for diverse facial expressions in generated images, supporting text-driven expression editing. Its scalability enables efficient and high-quality image creation, including multi-subject generation. Extensive results demonstrate superior performance over other state-of-the-art methods in both single and multiple image generation scenarios.

Topics

Machine Learning > Learning Types > Self-Supervised Learning
Deep Learning > Models > Diffusion Models
Computer Vision > Generation > Image Generation
Machine Learning > Application Areas > Model Compression
Deep Learning > Learning Types > Generative Models

Keywords

image generation, face recognition, text-to-image generation, diffusion model, model fine-tuning, identity preservation, face generation, cross-attention control, portrait generation, personalized image generation, recognition model, face recognition model
