MoSAR: Monocular Semi-Supervised Model for Avatar Reconstruction using Differentiable Shading

Abdallah Dib; Luiz Gustavo Hafemann; Emeline Got; Trevor Anderson; Amin Fadaeinejad; Rafael M. O. Cruz; Marc-André Carbonneau

2024 CVPR CVPR 2024

MoSAR: Monocular Semi-Supervised Model for Avatar Reconstruction using Differentiable Shading

Abstract

Reconstructing an avatar from a portrait image has many applications in multimedia but remains a challenging research problem. Extracting reflectance maps and geometry from one image is ill-posed: recovering geometry is a one-to-many mapping problem and reflectance and light are difficult to disentangle. Accurate geometry and reflectance can be captured under the controlled conditions of a light stage but it is costly to acquire large datasets in this fashion. Moreover training solely with this type of data leads to poor generalization with in-the-wild images. This motivates the introduction of MoSAR a method for 3D avatar generation from monocular images. We propose a semi-supervised training scheme that improves generalization by learning from both light stage and in-the-wild datasets. This is achieved using a novel differentiable shading formulation. We show that our approach effectively disentangles the intrinsic face parameters producing relightable avatars. As a result MoSAR estimates a richer set of skin reflectance maps and generates more realistic avatars than existing state-of-the-art methods. We also release a new dataset that provides intrinsic face attributes (diffuse specular ambient occlusion and translucency maps) for 10k subjects.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Computer Vision and Deep Learning and Machine Learning

🧭 Keyword Pioneer — differentiable shading

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Abdallah Dib , Luiz Gustavo Hafemann , Emeline Got , Trevor Anderson , Amin Fadaeinejad , Rafael M. O. Cruz , Marc-André Carbonneau

Topics

Artificial Intelligence > Core AI > Multimodal Learning Machine Learning > Core Methods > Representation Learning Machine Learning > Learning Types > Semi-Supervised Learning Deep Learning > Models > Generative Models Computer Vision > Analysis > 3D Vision Computer Vision > Generation > Image Generation Machine Learning > Learning Paradigms > Semi-Supervised Learning

Keywords

semi-supervised learning 3d avatar monocular image relightable avatar reflectance map avatar reconstruction differentiable shading intrinsic face decomposition

Download PDF

Related papers

DUSt3R: Geometric 3D Vision Made Easy 2024

Bezier Everywhere All at Once: Learning Drivable Lanes as Bezier Graphs 2024

NeRFDeformer: NeRF Transformation from a Single View via 3D Scene Flows 2024

Unleashing Unlabeled Data: A Paradigm for Cross-View Geo-Localization 2024

DIMAT: Decentralized Iterative Merging-And-Training for Deep Learning Models 2024