AAAI 2025

MOCID: Motion Context and Displacement Information Learning for Moving Infrared Small Target Detection

Abstract

In Moving Infrared Small Target Detection (MIRSTD), current methods typically use sequential modeling with two separate modules for spatial and temporal processing. However, this modeling strategy lacks clear guidance on how the motion and displacement of moving targets differ from those of background noise, which limits feature discriminability and leads to error-prone target localization. This paper addresses the issue at both the clip level and the frame level and proposes a novel architecture, MOCID, for MIRSTD. For clip-level feature fusion, we design a spatio-temporal backbone consisting of several proposed Fourier-inspired Spatio-temporal Attention (FISTA) layers. Each FISTA layer sequentially processes the features from the spatial and temporal views to capture clip-level temporal motion context, applying a Fourier transform and an inverse Fourier transform in each view. This context is then embedded into dynamic convolutional kernels for subsequent spatial feature extraction, thereby providing clear motion-difference guidance and generating comprehensive features. For frame-level feature fusion, we design a Displacement-aware Mamba Module (DAM) to capture detailed frame-to-frame displacement information. DAM uses a Temporal Interpolation and Displacement-aware Scan technique to perform spatio-temporal difference-aware displacement modeling, introducing fine-grained temporal indicators into feature extraction. Combining these improvements, our model captures comprehensive motion and displacement contexts, significantly improving the detection of small targets. Extensive experiments demonstrate that MOCID achieves state-of-the-art detection accuracy on the popular IRDST and DAUB datasets. Furthermore, MOCID offers a superior balance between throughput and performance compared to other methods. The code for this work will be made publicly available.
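
The abstract does not specify the internals of a FISTA layer, so the following is only a minimal NumPy sketch of the general idea it names: a Fourier transform and inverse Fourier transform applied first in the spatial view and then in the temporal view of a clip. The function names (`fourier_mix`, `spatio_temporal_mix`), the clip layout `(T, C, H, W)`, and the fixed per-frequency weights are all assumptions made for illustration; the actual layer presumably learns its weights and includes the attention and dynamic-kernel components, which are not modeled here.

```python
import numpy as np


def fourier_mix(x, weights, axes):
    """FFT over `axes`, scale each frequency by `weights`, then inverse FFT.

    Hypothetical helper: `weights` stands in for whatever frequency-domain
    parameters a real layer would learn.
    """
    spectrum = np.fft.fftn(x, axes=axes)
    return np.fft.ifftn(spectrum * weights, axes=axes).real


def spatio_temporal_mix(clip, spatial_w, temporal_w):
    """Process a clip of shape (T, C, H, W): spatial view first, then temporal."""
    spatial = fourier_mix(clip, spatial_w, axes=(2, 3))   # over H and W
    return fourier_mix(spatial, temporal_w, axes=(0,))    # over T


rng = np.random.default_rng(0)
clip = rng.standard_normal((5, 4, 16, 16))  # assumed layout: (T, C, H, W)

# All-pass weights: the forward/inverse transforms cancel and the clip is unchanged.
spatial_w = np.ones((1, 1, 16, 16))
temporal_w = np.ones((5, 1, 1, 1))
identity_out = spatio_temporal_mix(clip, spatial_w, temporal_w)

# Zeroing the temporal DC bin removes whatever is constant across frames,
# i.e. a perfectly static background, leaving only frame-to-frame changes.
temporal_hp = temporal_w.copy()
temporal_hp[0] = 0.0
motion_out = spatio_temporal_mix(clip, spatial_w, temporal_hp)
```

In this toy setting, zeroing the temporal zero-frequency component is equivalent to subtracting the per-pixel temporal mean, which hints at why frequency-domain processing along the time axis can help separate moving targets from a static background.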
