MonoPP: Metric-Scaled Self-Supervised Monocular Depth Estimation by Planar-Parallax Geometry in Automotive Applications

Gasser Elazab; Torben Gräber; Michael Unterreiner; Olaf Hellwich

WACV 2025

Abstract

Self-supervised monocular depth estimation (MDE) has gained popularity for obtaining depth predictions directly from videos. However, these methods often produce scale-invariant results unless additional training signals are provided. Addressing this challenge, we introduce a novel self-supervised metric-scaled MDE model that requires only monocular video data and the camera's mounting position, both of which are readily available in modern vehicles. Our approach leverages planar-parallax geometry to reconstruct scene structure. The full pipeline consists of three main networks: a multi-frame network, a single-frame network, and a pose network. The multi-frame network processes sequential frames to estimate the structure of the static scene using planar-parallax geometry and the camera mounting position. Based on this reconstruction, it acts as a teacher, distilling knowledge such as scale information, masked drivable area, metric-scale depth for the static scene, and dynamic object mask to the single-frame network. It also aids the pose network in predicting a metric-scaled relative pose between two subsequent images. Our method achieved state-of-the-art results for the driving benchmark KITTI for metric-scaled depth prediction. Notably, it is one of the first methods to produce self-supervised metric-scaled depth prediction for the challenging Cityscapes dataset, demonstrating its effectiveness and versatility. Project page: https://mono-pp.github.io/
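The abstract does not spell out how the camera mounting position yields metric scale, so the following is only an illustrative sketch, not the authors' implementation. It assumes the common planar-parallax convention in which a network predicts gamma = h / Z per pixel (h is the height above the road plane, Z is the depth), the road plane satisfies n^T P = d_c for camera height d_c, and the camera axes are x right, y down, z forward. Under those assumptions, h = d_c - Z n^T K^-1 p, which rearranges to Z = d_c / (gamma + n^T K^-1 p). The function name `depth_from_gamma` and all parameter names are hypothetical.

```python
import numpy as np


def depth_from_gamma(gamma, K, normal, cam_height):
    """Convert a planar-parallax map gamma = h / Z into metric depth Z.

    Assumed conventions (not taken from the paper):
      gamma:      (H, W) array, height above the road plane divided by depth
      K:          (3, 3) pinhole intrinsics
      normal:     (3,) unit road normal in camera coordinates, pointing from
                  the camera toward the road, e.g. (0, 1, 0) for y-down axes
      cam_height: camera mounting height above the road, in meters
    """
    H, W = gamma.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    # Homogeneous pixel coordinates, shape (H, W, 3).
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
    # Back-projected rays K^-1 p for every pixel.
    rays = pix @ np.linalg.inv(K).T
    # n^T K^-1 p for every pixel, shape (H, W).
    n_dot_ray = rays @ normal
    denom = gamma + n_dot_ray
    # Where denom <= 0 the relation yields no positive depth (for example,
    # sky pixels with gamma = 0), so those pixels are marked as infinity.
    with np.errstate(divide="ignore", invalid="ignore"):
        depth = np.where(denom > 1e-6, cam_height / denom, np.inf)
    return depth


# Toy example: flat road (gamma = 0) plus one pixel 1 m above the road.
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 40.0],
              [0.0, 0.0, 1.0]])
gamma = np.zeros((80, 100))
gamma[45, 50] = 0.1  # h = 1 m at Z = 10 m
depth = depth_from_gamma(gamma, K, np.array([0.0, 1.0, 0.0]), cam_height=1.5)
```

In this toy setup the known camera height is the only metric quantity entering the computation, which is the sense in which a mounting position can fix the scale of an otherwise scale-ambiguous prediction.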

Topics

Artificial Intelligence > Core AI > Autonomous Vehicles; Machine Learning > Learning Types > Self-Supervised Learning; Computer Vision > Analysis > Depth Estimation; Computer Vision > Domain-Specific > Autonomous Driving; Deep Learning > Learning Types > Self-Supervised Learning

Keywords

self-supervised learning; autonomous driving; depth estimation; monocular depth estimation; monocular depth; depth prediction; metric-scaled depth; planar-parallax geometry; metric-scale depth
