WACV 2026

A Framework for Real-Time Surgical Phase Recognition with Application to Robot-Assisted Partial Nephrectomy

Abstract

Surgical practice has increasingly integrated advanced technologies to improve procedural outcomes, efficiency, and safety in modern operating rooms. Within this evolving landscape, Automated Surgical Phase Recognition (SPR) leverages Artificial Intelligence to temporally segment surgical workflows into key events, thereby supporting both real-time decision-making and off-line analysis. Despite the potential of SPR, previous research has focused on short, linear surgeries and has paid limited attention to the development, assessment, and deployment of real-time systems for complex surgical workflows. This work addresses these gaps by targeting the highly complex, non-linear workflow of Robot-Assisted Partial Nephrectomy (RAPN). We develop a real-time SPR system trained on 143 annotated RAPN surgical videos spanning 15 phases. The system incorporates a trainable canonical calibration error estimator combined with Viterbi decoding for more reliable outcomes. Additionally, we introduce a novel assessment framework designed to simultaneously evaluate off-line, real-time, and averaged SPR performance, the last of which synthesises historical phase predictions over time. For deployment, we implement the SPR pipeline as an end-to-end application using the NVIDIA Holoscan platform. The system was successfully tested during three live RAPN cases on human patients in a collaborating hospital, achieving an average inference latency of 16.65 ms and an accuracy of 68.2%. Results indicate that Viterbi decoding boosts performance in this complex surgery, while canonical calibration does not significantly increase overall performance but enhances classification reliability. We show the feasibility of deploying a real-time SPR pipeline for RAPN, which holds promise for optimising operating room (OR) planning. The application is available at https://github.com/nvidia-holoscan/holohub/tree/main/applications/orsi
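
To illustrate how Viterbi decoding can stabilise frame-wise phase predictions, the following is a minimal off-line sketch, not the authors' implementation. It assumes per-frame phase probabilities from some classifier and a hand-set "sticky" transition matrix; the function name `viterbi_decode`, the three-phase toy data, and the transition values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def viterbi_decode(frame_probs, transition, prior=None, eps=1e-12):
    """Most likely phase sequence given per-frame probabilities (off-line)."""
    frame_probs = np.asarray(frame_probs, dtype=float)
    transition = np.asarray(transition, dtype=float)
    n_frames, n_phases = frame_probs.shape
    if prior is None:
        # Assumption: uniform prior over the starting phase.
        prior = np.full(n_phases, 1.0 / n_phases)

    # Work in log space to avoid numerical underflow on long videos.
    log_emit = np.log(frame_probs + eps)
    log_trans = np.log(transition + eps)

    score = np.log(np.asarray(prior, dtype=float) + eps) + log_emit[0]
    backptr = np.zeros((n_frames, n_phases), dtype=int)

    for t in range(1, n_frames):
        # cand[i, j]: best score ending in phase i at t-1, then moving to j.
        cand = score[:, None] + log_trans
        backptr[t] = np.argmax(cand, axis=0)
        score = cand[backptr[t], np.arange(n_phases)] + log_emit[t]

    # Backtrack from the best final phase.
    path = np.zeros(n_frames, dtype=int)
    path[-1] = int(np.argmax(score))
    for t in range(n_frames - 1, 0, -1):
        path[t - 1] = backptr[t, path[t]]
    return path.tolist()


# Toy example (hypothetical values): 3 phases, 6 frames, one noisy frame.
frame_probs = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],  # isolated flicker towards phase 1
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
]
# Sticky transitions: staying in a phase is far more likely than switching.
transition = [
    [0.90, 0.05, 0.05],
    [0.05, 0.90, 0.05],
    [0.05, 0.05, 0.90],
]

raw = np.argmax(frame_probs, axis=1).tolist()
smoothed = viterbi_decode(frame_probs, transition)
print("frame-wise argmax:", raw)
print("Viterbi path:     ", smoothed)
```

In this toy case the frame-wise argmax switches to phase 1 for a single frame, while the decoded path suppresses that flicker and keeps the later, sustained transition. The backtracking step needs the whole sequence, so a real-time variant would have to rely on the running scores instead; how the paper handles this is not stated in the abstract.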
