LASR: Learning Articulated Shape Reconstruction From a Monocular Video

Gengshan Yang; Deqing Sun; Varun Jampani; Daniel Vlasic; Forrester Cole; Huiwen Chang; Deva Ramanan; William T. Freeman; Ce Liu

CVPR 2021

Abstract

Remarkable progress has been made in 3D reconstruction of rigid structures from a video or a collection of images. However, reconstructing nonrigid structures from RGB inputs remains challenging because the problem is under-constrained. While template-based approaches, such as parametric shape models, have achieved great success in modeling the "closed world" of known object categories, their ability to handle the "open world" of novel object categories and outlier shapes is still limited. In this work, we introduce a template-free approach for 3D shape learning from a single video. It adopts an analysis-by-synthesis strategy that forward-renders object silhouettes, optical flow, and pixel intensities and compares them against video observations, which generates gradient signals to adjust the camera, shape, and motion parameters. Without relying on a category-specific shape template, our method faithfully reconstructs nonrigid 3D structures from videos of humans, animals, and objects of unknown classes in the wild.
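The analysis-by-synthesis loop described in the abstract can be illustrated with a deliberately tiny sketch. This is not the LASR implementation: the "renderer" below is a hypothetical stand-in that draws 2D points on a circle, the single `radius` parameter plays the role of shape, the translation `(tx, ty)` plays the role of camera/motion, and a squared-error loss replaces the silhouette, flow, and pixel terms. All function names and hyperparameters are assumptions made for illustration.

```python
import math


def render(radius, tx, ty, n=16):
    """Toy forward 'renderer' (assumed): n 2D points on a translated circle."""
    pts = []
    for k in range(n):
        a = 2.0 * math.pi * k / n
        pts.append((tx + radius * math.cos(a), ty + radius * math.sin(a)))
    return pts


def loss_and_grad(params, observed):
    """Mean squared error between rendered and observed points,
    with its analytic gradient w.r.t. (radius, tx, ty)."""
    radius, tx, ty = params
    n = len(observed)
    pred = render(radius, tx, ty, n)
    loss = 0.0
    g_r = g_tx = g_ty = 0.0
    for k, ((px, py), (ox, oy)) in enumerate(zip(pred, observed)):
        a = 2.0 * math.pi * k / n
        ex, ey = px - ox, py - oy
        loss += (ex * ex + ey * ey) / n
        g_r += 2.0 * (ex * math.cos(a) + ey * math.sin(a)) / n
        g_tx += 2.0 * ex / n
        g_ty += 2.0 * ey / n
    return loss, (g_r, g_tx, g_ty)


def fit(observed, init=(1.0, 0.0, 0.0), lr=0.2, steps=200):
    """Plain gradient descent: render, compare, update parameters."""
    params = list(init)
    for _ in range(steps):
        _, grad = loss_and_grad(params, observed)
        params = [p - lr * g for p, g in zip(params, grad)]
    return tuple(params)


# Synthetic "observation" produced with known parameters.
observed = render(2.5, 0.7, -0.3)
radius, tx, ty = fit(observed)
final_loss, _ = loss_and_grad((radius, tx, ty), observed)
```

In this toy setting the loss is quadratic in the parameters, so gradient descent recovers the generating values; the real problem is far less constrained, which is why the abstract stresses combining several rendered signals.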

Topics

Machine Learning > Core Methods > Representation Learning

Computer Vision > Analysis > 3D Vision

Computer Vision > Analysis > Scene Understanding

Deep Learning > Learning Types > Self-Supervised Learning

Computer Vision > Generation > 3D Generation

Computer Vision > Processing > 3D Vision

Keywords

3D reconstruction, nonrigid structure, optical flow, monocular video, articulated shape, shape learning

Related papers

Learning To Reconstruct High Speed and High Dynamic Range Videos From Events 2021

DeFLOCNet: Deep Image Editing via Flexible Low-Level Controls 2021

Vx2Text: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs 2021

Coming Down to Earth: Satellite-to-Street View Synthesis for Geo-Localization 2021

Pose-Guided Human Animation From a Single Image in the Wild 2021