Self-Supervised Human Depth Estimation From Monocular Videos

Feitong Tan; Hao Zhu; Zhaopeng Cui; Siyu Zhu; Marc Pollefeys; Ping Tan

CVPR 2020

Abstract

Previous methods for estimating detailed human depth often require supervised training with 'ground truth' depth data. This paper presents a self-supervised method that can be trained on YouTube videos without known depth, which simplifies training data collection and improves the generalization of the learned network. Self-supervision is achieved by minimizing a photo-consistency loss, evaluated between a video frame and its neighboring frames warped according to the estimated depth and the 3D non-rigid motion of the human body. To recover this non-rigid motion, we first estimate a rough SMPL model at each video frame and compute the non-rigid body motion from it, which enables self-supervised learning of shape details. Experiments demonstrate that our method generalizes better and performs much better on in-the-wild data.
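The photo-consistency idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the intrinsics matrix `K`, and the per-pixel 3D displacement field `motion` (standing in for the SMPL-derived non-rigid body motion) are assumptions made for illustration, and nearest-neighbor sampling is used in place of the differentiable sampling a trainable network would need.

```python
import numpy as np

def backproject(depth, K):
    """Lift each pixel (u, v) with depth d to the 3D point d * K^-1 [u, v, 1]^T."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)  # (H, W, 3)
    rays = pix @ np.linalg.inv(K).T
    return rays * depth[..., None]

def project(points, K):
    """Project 3D points (H, W, 3) to pixel coordinates; also return their depth."""
    proj = points @ K.T
    z = proj[..., 2]
    uv = proj[..., :2] / np.maximum(z, 1e-8)[..., None]
    return uv, z

def photo_consistency_loss(frame, neighbor, depth, motion, K):
    """Mean absolute color difference between `frame` and `neighbor` sampled
    at the locations predicted by `depth` and `motion`.

    `motion` is a hypothetical (H, W, 3) array of per-pixel 3D displacements,
    assumed here to come from some body-motion estimate.
    """
    h, w = depth.shape
    points = backproject(depth, K) + motion      # move each surface point in 3D
    uv, z = project(points, K)                   # where it lands in the neighbor
    u = np.rint(uv[..., 0]).astype(int)          # nearest-neighbor sampling
    v = np.rint(uv[..., 1]).astype(int)
    valid = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    if not valid.any():
        return 0.0
    warped = neighbor[v[valid], u[valid]]
    return float(np.abs(frame[valid] - warped).mean())
```

With the correct depth and motion the warped neighbor matches the frame and the loss is near zero; a wrong depth shifts the sampled pixels and raises the loss, which is the signal a depth network could be trained on without ground-truth depth.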

Topics

Machine Learning > Learning Types > Self-Supervised Learning
Computer Vision > Analysis > Depth Estimation
Deep Learning > Learning Types > Self-Supervised Learning
Computer Vision > Processing > Depth Estimation

Keywords

self-supervised learning, depth estimation, human pose estimation, monocular video, SMPL model, photo-consistency loss
