TexVocab: Texture Vocabulary-conditioned Human Avatars

Yuxiao Liu; Zhe Li; Yebin Liu; Haoqian Wang

CVPR 2024

Abstract

To adequately utilize the available image evidence in multi-view video-based avatar modeling, we propose TexVocab, a novel avatar representation that constructs a texture vocabulary and associates body poses with texture maps for animation. Given multi-view RGB videos, our method first back-projects all available images in the training videos onto the posed SMPL surface, producing texture maps in the SMPL UV domain. We then construct pairs of human poses and texture maps to establish a texture vocabulary that encodes dynamic human appearances under various poses. Unlike the commonly used joint-wise manner, we further design a body-part-wise encoding strategy to learn the structural effects of the kinematic chain. Given a driving pose, we query the pose feature hierarchically by decomposing the pose vector into several body parts and interpolating the texture features to synthesize fine-grained human dynamics. Overall, our method creates animatable human avatars with detailed and dynamic appearances from RGB videos, and experiments show that it outperforms state-of-the-art approaches.
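The query step described in the abstract (decompose the driving pose into body parts, then interpolate texture features from stored pose-texture pairs) can be illustrated with a toy sketch. This is not the authors' implementation. The body-part layout in `BODY_PARTS`, the function names, the use of Euclidean distance on raw pose values, and the inverse-distance weighting over the `k` nearest key poses are all assumptions made for illustration. Real texture features would be UV-space feature maps rather than the short lists used here.

```python
import math

# Hypothetical body-part layout: part name -> indices into a flat pose vector.
# The actual grouping of joints into parts is not specified in the abstract.
BODY_PARTS = {
    "torso": [0, 1],
    "left_arm": [2, 3],
}


def part_pose(pose, part):
    """Extract the sub-vector of the pose that belongs to one body part."""
    return [pose[i] for i in BODY_PARTS[part]]


def build_vocabulary(training_poses, training_features):
    """Pair every training pose, split per body part, with its feature."""
    vocab = {part: [] for part in BODY_PARTS}
    for pose, feat in zip(training_poses, training_features):
        for part in BODY_PARTS:
            vocab[part].append((part_pose(pose, part), feat[part]))
    return vocab


def query(vocab, driving_pose, k=2, eps=1e-8):
    """Blend the features of the k nearest key poses, one body part at a time.

    Inverse-distance weighting is an assumed interpolation scheme.
    """
    out = {}
    for part, entries in vocab.items():
        q = part_pose(driving_pose, part)
        nearest = sorted(entries, key=lambda e: math.dist(q, e[0]))[:k]
        weights = [1.0 / (math.dist(q, key) + eps) for key, _ in nearest]
        total = sum(weights)
        dim = len(nearest[0][1])
        out[part] = [
            sum(w * feat[d] for w, (_, feat) in zip(weights, nearest)) / total
            for d in range(dim)
        ]
    return out


# Two toy training frames: pose vectors and per-part stand-in features.
poses = [[0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]]
features = [
    {"torso": [0.0], "left_arm": [10.0]},
    {"torso": [2.0], "left_arm": [20.0]},
]
vocab = build_vocabulary(poses, features)

# Torso matches the first frame, left arm matches the second frame.
mixed = query(vocab, [0.0, 0.0, 1.0, 1.0])
```

Because each body part is queried independently, the driving pose above draws its torso feature almost entirely from the first frame and its arm feature almost entirely from the second, a combination that never appears as a whole in the toy training set. That is the intuition behind encoding per body part rather than per whole pose; the paper's hierarchical query and learned features are more involved than this sketch.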

Topics

Machine Learning > Core Methods > Representation Learning
Computer Science > Systems > Computer Graphics

Keywords

pose estimation; 3D modeling; multi-view video; texture mapping; human avatar
