EI-Nexus: Towards Unmediated and Flexible Inter-Modality Local Feature Extraction and Matching for Event-Image Data

Zhonghua Yi; Hao Shi; Qi Jiang; Kailun Yang; Ze Wang; Diyang Gu; Yufan Zhang; Kaiwei Wang

WACV 2025

Abstract

Event cameras offer high temporal resolution and high dynamic range, yet inter-modality local feature extraction and matching for event-image data remains little studied. We propose EI-Nexus, an unmediated and flexible framework that integrates two modality-specific keypoint extractors and a feature matcher. To achieve keypoint extraction that is robust to viewpoint and modality changes, we introduce Local Feature Distillation (LFD), which transfers viewpoint consistency from a well-learned image extractor to the event extractor, ensuring robust feature correspondence. Furthermore, Context Aggregation (CA) brings a remarkable improvement in feature matching. We further establish the first two inter-modality feature matching benchmarks, MVSEC-RPE and EC-RPE, to assess relative pose estimation on event-image data. Our approach outperforms traditional methods that rely on explicit modal transformation, offering more unmediated and adaptable feature extraction and matching, and achieves better keypoint similarity and state-of-the-art results on the MVSEC-RPE and EC-RPE benchmarks. The source code and benchmarks will be made publicly available at EI-Nexus.

Topics

Computer Vision > Analysis > 3D Vision
Computer Vision > Core AI > Multimodal Learning
Computer Vision > Analysis > Motion Estimation
Artificial Intelligence > Core AI > Computer Vision

Keywords

pose estimation, event camera, feature matching, multi-modal learning, local feature, local feature extraction
