Unsupervised Photometric-Consistent Depth Estimation from Endoscopic Monocular Video

Shijie Li; Weijun Lin; Qingyuan Xiang; Yunbin Tu; Shitan Asu; Zheng Li

AAAI 2025

Abstract

Recent advancements in unsupervised monocular depth estimation typically rely on the assumption that image photometry remains consistent across consecutive frames. However, this assumption often fails in endoscopic scenes because of 1) local photometric inconsistency caused by specular reflections that create highlights; and 2) global photometric inconsistency resulting from the simultaneous movement of the light source and the camera. Since unsupervised depth estimation methods rely on appearance discrepancies between frames as a supervisory signal, these photometric inconsistencies inevitably degrade the loss computation. In this paper, our goal is to obtain a strong and reliable supervisory signal for photometric-consistent depth estimation. To this end, for local photometric inconsistency, we use the specular reflection model to introduce a Highlight Loss for handling the estimation of highlight regions. For global photometric inconsistency, we design a Photometric Match module, which uses the spotlight illumination model to derive an analytical expression that aligns photometry across different frames. Compared with previous works that introduce additional optical flow or networks, our method is simpler and more efficient. Extensive experiments demonstrate that our method achieves state-of-the-art results on the C3VD, SCARED, and SERV-CT datasets.
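
The two failure modes named in the abstract can be illustrated with a toy example. The sketch below is not the paper's Highlight Loss or Photometric Match module; it only shows, under simplified assumptions, how a plain L1 photometric loss becomes non-zero when geometry is perfect but illumination changes. The function names, the saturation threshold, the 0.7 dimming factor, and the naive mean-ratio gain are all hypothetical choices made for illustration.

```python
import numpy as np


def photometric_l1(target, warped, mask=None):
    """Mean absolute intensity difference between two aligned frames.

    `mask` (optional, boolean) selects the pixels that contribute.
    """
    diff = np.abs(target - warped)
    if mask is not None:
        diff = diff[mask]
    return float(np.mean(diff))


def saturation_mask(image, threshold=0.95):
    """Hypothetical highlight detector: True where a pixel is NOT near-saturated."""
    return image < threshold


rng = np.random.default_rng(0)
# Toy grayscale "tissue" frame with intensities in [0.2, 0.6).
target = rng.uniform(0.2, 0.6, size=(64, 64))

# Case 0: perfect geometry, identical lighting. The loss is zero.
loss_ideal = photometric_l1(target, target)

# Case 1 (global inconsistency): perfect geometry, but the frame is
# uniformly dimmer, as if the light source had moved with the camera.
dimmed = 0.7 * target
loss_global = photometric_l1(target, dimmed)

# Naive global gain (an assumption, not the paper's analytical expression):
# rescale the source frame so its mean intensity matches the target.
gain = target.mean() / dimmed.mean()
loss_global_aligned = photometric_l1(target, gain * dimmed)

# Case 2 (local inconsistency): perfect geometry, but a saturated
# specular highlight covers a 10x10 patch in the source frame.
highlighted = target.copy()
highlighted[10:20, 10:20] = 1.0
loss_local = photometric_l1(target, highlighted)

# Ignoring near-saturated pixels removes the spurious penalty here.
loss_local_masked = photometric_l1(
    target, highlighted, mask=saturation_mask(highlighted)
)

print(loss_ideal, loss_global, loss_global_aligned, loss_local, loss_local_masked)
```

In both perturbed cases the depth and pose are exactly right, yet the unmodified loss is positive, so a network trained on it would be pushed away from the correct geometry. Masking and a single global gain are only crude stand-ins; real endoscopic illumination varies with distance and angle to the light source, which is the situation the abstract's spotlight illumination model is meant to address.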

Topics

Machine Learning > Learning Types > Unsupervised Learning
Computer Vision > Analysis > 3D Vision
Computer Vision > Analysis > Depth Estimation
Computer Vision > Domain-Specific > Medical Imaging
Healthcare & Medicine > Clinical > Medical Imaging

Keywords

unsupervised learning, depth estimation, monocular depth estimation, monocular video, endoscopic video, specular reflection, photometric consistency
