FEAST-Mamba: FEAture and SpaTial Aware Mamba Network with Bidirectional Orthogonal Fusion for Cross-Modal Point Cloud Segmentation

Chade Li; Pengju Zhang; Bo Liu; Hao Wei; Yihong Wu

AAAI 2025

Abstract

Point cloud segmentation has a wide range of applications in autonomous driving, augmented reality, and virtual reality. Multi-modal fusion strategies have recently received increasing attention in point cloud segmentation. Despite their success, existing methods often incur unnecessary information loss or redundancy. In this paper, we propose FEAST-Mamba, a novel FEAture and SpaTial aware Mamba network for multi-modal point cloud segmentation. To exploit the complementarity between modalities, we propose a bidirectional orthogonal attention module, in which the features of each modality first interact bidirectionally through cross-modal attention, and orthogonal fusion is then used to reduce feature redundancy. Furthermore, we propose a reordering strategy for the Mamba architecture that takes both spatial and semantic information into account during cross-modal feature ordering. Experiments on the indoor datasets S3DIS and ScanNet and the outdoor datasets nuScenes and SemanticKITTI show that the proposed method achieves state-of-the-art performance.
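The abstract does not spell out how orthogonal fusion is computed. As a rough illustration only, the sketch below assumes one common reading: keep the point feature as is, and append only the part of the image feature that is orthogonal to it, so the fused vector does not repeat information already carried by the point feature. The function names `orthogonal_component` and `orthogonal_fuse`, the plain-list feature representation, and the choice of the point feature as the reference are all assumptions made for this example, not the authors' implementation.

```python
def orthogonal_component(f_src, f_ref, eps=1e-12):
    """Return the part of f_src that is orthogonal to f_ref.

    Hypothetical helper: subtracts the projection of f_src onto f_ref.
    """
    dot = sum(s * r for s, r in zip(f_src, f_ref))
    ref_sq = sum(r * r for r in f_ref)
    scale = dot / (ref_sq + eps)  # eps guards against a zero reference vector
    return [s - scale * r for s, r in zip(f_src, f_ref)]


def orthogonal_fuse(f_point, f_image):
    """Concatenate the point feature with the image residual orthogonal to it.

    Assumed fusion rule for illustration; the paper may use a different one.
    """
    return list(f_point) + orthogonal_component(f_image, f_point)


if __name__ == "__main__":
    f_point = [1.0, 0.0]
    f_image = [3.0, 4.0]
    print(orthogonal_fuse(f_point, f_image))
```

In this toy case the image feature's component along the point feature is removed, and only the remaining direction is appended. In a real network the same projection would be applied per point on learned feature tensors, after the cross-modal attention step.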

Topics

Artificial Intelligence > Core AI > Multimodal Learning
Deep Learning > Architectures > Neural Networks
Computer Vision > Analysis > 3D Vision
Computer Vision > Domain-Specific > Autonomous Driving
Computer Vision > Processing > Semantic Segmentation

Keywords

feature extraction, attention mechanism, autonomous driving, 3D vision, multi-modal fusion, cross-modal attention, point cloud segmentation, spatial awareness, Mamba network
