Arc2Avatar: Generating Expressive 3D Avatars from a Single Image via ID Guidance

Dimitrios Gerogiannis; Foivos Paraperas Papantoniou; Rolandos Alexandros Potamias; Alexandros Lattas; Stefanos Zafeiriou

2025 CVPR CVPR 2025

Arc2Avatar: Generating Expressive 3D Avatars from a Single Image via ID Guidance

Abstract

Inspired by the effectiveness of 3D Gaussian Splatting (3DGS) in reconstructing detailed 3D scenes within multi-view setups and the emergence of large 2D human foundation models, we introduce Arc2Avatar, the first SDS-based method utilizing a human face foundation model as guidance with just a single image as input. To achieve that, we extend such a model for diverse-view human head generation by fine-tuning on synthetic data and modifying its conditioning. Our avatars maintain a dense correspondence with a human face mesh template, allowing blendshape-based expression generation. This is achieved through a modified 3DGS approach, connectivity regularizers, and a strategic initialization tailored for our task. Additionally, we propose an optional efficient SDS-based correction step to refine the blendshape expressions, enhancing realism and diversity. Experiments demonstrate that Arc2Avatar achieves state-of-the-art realism and identity preservation, effectively addressing color issues by allowing the use of very low guidance, enabled by our strong identity prior and initialization strategy, without compromising detail.

🌉 Interdisciplinary Bridge — Computer Vision and Deep Learning

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Dimitrios Gerogiannis , Foivos Paraperas Papantoniou , Rolandos Alexandros Potamias , Alexandros Lattas , Stefanos Zafeiriou

Topics

Deep Learning > Models > Diffusion Models Deep Learning > Models > Generative Models Computer Vision > Analysis > 3D Vision Computer Vision > Generation > Image Generation Computer Vision > Generation > 3D Generation Computer Vision > Domain-Specific > 3D Vision

Keywords

image generation foundation model 3d gaussian splatting score distillation sampling identity preservation single image reconstruction face reconstruction single image avatar generation

Download PDF

Related papers

AnyCam: Learning to Recover Camera Poses and Intrinsics from Casual Videos 2025

SeriesBench: A Benchmark for Narrative-Driven Drama Series Understanding 2025

FADE: Frequency-Aware Diffusion Model Factorization for Video Editing 2025

Fast and Accurate Gigapixel Pathological Image Classification with Hierarchical Distillation Multi-Instance Learning 2025

Reversible Decoupling Network for Single Image Reflection Removal 2025