DiffusionAvatars: Deferred Diffusion for High-fidelity 3D Head Avatars

Tobias Kirschstein; Simon Giebenhain; Matthias Nießner

2024 CVPR CVPR 2024

DiffusionAvatars: Deferred Diffusion for High-fidelity 3D Head Avatars

Abstract

DiffusionAvatars synthesizes a high-fidelity 3D head avatar of a person offering intuitive control over both pose and expression. We propose a diffusion-based neural renderer that leverages generic 2D priors to produce compelling images of faces. For coarse guidance of the expression and head pose we render a neural parametric head model (NPHM) from the target viewpoint which acts as a proxy geometry of the person. Additionally to enhance the modeling of intricate facial expressions we condition DiffusionAvatars directly on the expression codes obtained from NPHM via cross-attention. Finally to synthesize consistent surface details across different viewpoints and expressions we rig learnable spatial features to the head's surface via TriPlane lookup in NPHM's canonical space. We train DiffusionAvatars on RGB videos and corresponding fitted NPHM meshes of a person and test the obtained avatars in both self-reenactment and animation scenarios. Our experiments demonstrate that DiffusionAvatars generates temporally consistent and visually appealing videos for novel poses and expressions of a person outperforming existing approaches.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Computer Vision and Deep Learning and Machine Learning

🧭 Keyword Pioneer — neural parametric head model

🐣 Hot Topic Early Bird — facial animation

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Tobias Kirschstein , Simon Giebenhain , Matthias Nießner

Topics

Artificial Intelligence > Core AI > Procedural Generation Machine Learning > Core Methods > Representation Learning Deep Learning > Models > Diffusion Models Computer Vision > Generation > Image Generation Artificial Intelligence > Core AI > Computer Vision Computer Vision > Domain-Specific > Computer Graphics Computer Vision > Generation > 3D Generation

Keywords

facial animation neural rendering diffusion model novel view synthesis 3d avatar face reconstruction 3d head avatar head pose control neural parametric head model deferred diffusion triplane lookup parametric head model

Download PDF

Related papers

DUSt3R: Geometric 3D Vision Made Easy 2024

Bezier Everywhere All at Once: Learning Drivable Lanes as Bezier Graphs 2024

NeRFDeformer: NeRF Transformation from a Single View via 3D Scene Flows 2024

Unleashing Unlabeled Data: A Paradigm for Cross-View Geo-Localization 2024

DIMAT: Decentralized Iterative Merging-And-Training for Deep Learning Models 2024