WACV 2026

Revisiting Retentive Networks for Fast Range-View 3D LiDAR Semantic Segmentation

Abstract

LiDAR semantic segmentation is a crucial task in autonomous driving and robotics, where real-time performance is essential for online decision-making. Recent approaches combine range images with Vision Transformers, relying on the self-attention mechanism. However, these approaches often lack explicit spatial priors and involve a large number of parameters. To address these limitations, we propose a novel method that adapts the Retentive Network architecture from the Natural Language Processing (NLP) field, chosen for its efficient sequence modeling capabilities, and operates directly on the range-view representation. Our approach incorporates a circular retention (CiR) mechanism that explicitly captures the spatial relationships and the continuous circular property of the range image while modeling long-range dependencies and preserving the receptive field. In addition, we introduce a new set of range-view augmentations, adapted from 3D techniques, to improve generalization and mitigate class imbalance. Extensive experiments on three large-scale datasets, SemanticKITTI, PandaSet, and SemanticPOSS, demonstrate that our method achieves state-of-the-art performance among range-view approaches on two out of three datasets, while satisfying real-time constraints. The code is available at https://github.com/SiMoM0/RangeRet.
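
To make the idea of a circular spatial prior concrete, the following is a minimal sketch, not the implementation from the paper or its repository. It assumes a retention-style decay of the form gamma ** distance between pixel pairs, with a Manhattan distance whose horizontal term wraps around the image width, reflecting that the left and right edges of a 360° range image are adjacent. The function name `circular_decay_mask` and the parameter `gamma` are hypothetical and chosen for illustration only.

```python
import numpy as np


def circular_decay_mask(height, width, gamma=0.9):
    """Return an (H*W, H*W) matrix with entries gamma ** d.

    d is a Manhattan distance between two pixels of an H x W range image,
    where the horizontal term wraps around the image width (assumption:
    this is one plausible way to encode a circular spatial prior).
    """
    rows, cols = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    rows = rows.ravel()  # row index of every flattened pixel
    cols = cols.ravel()  # column index of every flattened pixel

    # Vertical distance: plain absolute difference (no wrap-around).
    dv = np.abs(rows[:, None] - rows[None, :])

    # Horizontal distance: take the shorter way around the circle.
    dh = np.abs(cols[:, None] - cols[None, :])
    dh = np.minimum(dh, width - dh)

    return gamma ** (dv + dh)


# Toy example: a 2 x 8 range image, flattened row by row.
mask = circular_decay_mask(height=2, width=8, gamma=0.5)
```

In this toy example, the pixels at columns 0 and 7 of the same row are treated as direct neighbours, so their entry in `mask` equals `gamma`, whereas a non-circular distance would place them seven steps apart. Such a mask could be multiplied element-wise with pairwise similarity scores so that nearby pixels, including those across the image seam, contribute more strongly.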
