Physics-Based Human Motion Estimation and Synthesis From Videos

Kevin Xie; Tingwu Wang; Umar Iqbal; Yunrong Guo; Sanja Fidler; Florian Shkurti

ICCV 2021

Abstract

Human motion synthesis is an important problem for applications in graphics and gaming, and for simulation environments in robotics. Existing methods require accurate motion capture data for training, which is costly to obtain. Instead, we propose a framework for training generative models of physically plausible human motion directly from monocular RGB videos, which are much more widely available. At the core of our method is a novel optimization formulation that corrects imperfect image-based pose estimates by enforcing physics constraints and reasoning about contacts in a differentiable way. This optimization yields corrected 3D poses and motions, as well as their corresponding contact forces. Results show that our physically corrected motions significantly outperform prior work on pose estimation. We then train a generative model to synthesize both future motion and contact forces. On the large-scale Human3.6M dataset, we demonstrate, both qualitatively and quantitatively, significantly improved motion synthesis quality and physical plausibility compared to prior learning-based kinematic and physics-based methods. By learning directly from video, our method paves the way for large-scale, realistic and diverse motion synthesis not previously possible.
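
To make the idea of correcting pose estimates with physics constraints concrete, the following is a minimal toy sketch and not the paper's formulation. It assumes a single point mass moving vertically, a finite-difference acceleration, and a soft penalty that discourages contact force away from the ground. The function name `physics_loss`, its weights, and the specific penalty terms are all hypothetical choices made for illustration.

```python
import numpy as np

def physics_loss(z, f, z_obs, dt=1.0 / 30.0, mass=1.0, g=9.81,
                 w_phys=1.0, w_contact=1.0):
    """Toy objective for a point mass moving vertically (illustrative only).

    z     : candidate heights above the ground, shape (T,)
    f     : candidate vertical contact forces, shape (T,)
    z_obs : noisy heights, standing in for image-based estimates, shape (T,)
    """
    # Data term: stay close to the (imperfect) observations.
    data = np.sum((z - z_obs) ** 2)

    # Central finite-difference acceleration at interior frames.
    acc = (z[2:] - 2.0 * z[1:-1] + z[:-2]) / dt ** 2

    # Dynamics residual from Newton's second law: m * a = f - m * g.
    dyn = np.sum((mass * acc - (f[1:-1] - mass * g)) ** 2)

    # Soft contact terms (assumed form): force should vanish when the body
    # is above the ground, force must not pull downward, and the body must
    # not penetrate the ground.
    contact = (np.sum((f * np.maximum(z, 0.0)) ** 2)
               + np.sum(np.minimum(f, 0.0) ** 2)
               + np.sum(np.minimum(z, 0.0) ** 2))

    return data + w_phys * dyn + w_contact * contact

T, dt, g = 10, 1.0 / 30.0, 9.81
t = np.arange(T) * dt

# Resting on the ground, supported by a force equal to the weight.
rest = physics_loss(np.zeros(T), np.full(T, g), np.zeros(T), dt=dt)

# Free fall from 5 m with no contact force.
z_fall = 5.0 - 0.5 * g * t ** 2
fall = physics_loss(z_fall, np.zeros(T), z_fall, dt=dt)

# Hovering at 1 m with no contact force: matches its observations
# but violates the dynamics.
z_hover = np.ones(T)
hover = physics_loss(z_hover, np.zeros(T), z_hover, dt=dt)
```

In this sketch, the resting and free-fall trajectories are physically consistent and incur essentially no penalty, while the hovering trajectory is penalized by the dynamics term even though it fits its observations exactly. Every term is differentiable almost everywhere, so an objective of this shape could in principle be minimized over `z` and `f` with a gradient-based optimizer.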

Topics

Machine Learning > Learning Types > Self-Supervised Learning
Computer Vision > Generation > Video Generation
Robotics > Capabilities > Motion Planning
Computer Vision > Analysis > Motion Analysis
Computer Science > Applications > Computer Graphics

Keywords

pose estimation, human motion synthesis, generative model, monocular video, physics-based simulation, contact force, physics-based optimization
