AAAI 2026

IPDA: Intelligent Perception Delay Alignment Method Based on Spatio-Temporal Co-Sensing Calibration

Abstract

Intelligent perception among multiple agents lets each agent extend its individual observation capability by sharing sensory information, improving the completeness and accuracy of environmental understanding. However, real-world communication is often subject to non-negligible delays, which degrade the effectiveness of perception. To mitigate this, delay alignment is commonly employed to synchronize delayed observations to a common timestamp. Yet both alignment errors and inherent discrepancies between multi-view observations can lead to inconsistencies in the estimated position and orientation of shared targets. These inconsistencies can accumulate during feature fusion, ultimately reducing the accuracy and reliability of the perception results. To address this challenge, we propose IPDA, a delay-aware multi-agent intelligent perception method that performs joint calibration in both the temporal and spatial domains. In the temporal dimension, we design a historical alignment attention mechanism to model dynamic delay correction across sequences, ensuring temporal coherence. In the spatial dimension, we introduce a discrepancy-quantized co-sensing network that captures and compensates for multi-view spatial deviations caused by viewpoint diversity and alignment inaccuracy. IPDA is evaluated on two large-scale intelligent perception benchmarks, DAIR-V2X and OPV2V. Experimental results demonstrate that our method effectively mitigates delay-induced inconsistencies and consistently outperforms state-of-the-art baselines under various delay conditions.
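To make the notion of delay alignment concrete, the following is a minimal illustrative sketch and not the IPDA method itself. It assumes a simple constant-velocity motion model, which stands in for the learned historical alignment attention described above. The `Detection` fields and the `align_to` and `position_gap` helpers are hypothetical names introduced only for this example. The sketch extrapolates a delayed detection to the receiver's timestamp and measures the residual position gap against the receiver's own view of the same target, the kind of spatial inconsistency that can remain after temporal alignment.

```python
import math
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical shared-target estimate in a common world frame."""
    x: float          # position (m)
    y: float          # position (m)
    yaw: float        # heading (rad)
    vx: float         # estimated velocity (m/s)
    vy: float         # estimated velocity (m/s)
    yaw_rate: float   # estimated turn rate (rad/s)
    timestamp: float  # capture time (s)


def align_to(det: Detection, target_time: float) -> Detection:
    """Extrapolate a delayed detection to target_time.

    Assumption: constant velocity and constant yaw rate over the delay.
    """
    dt = target_time - det.timestamp
    yaw = det.yaw + det.yaw_rate * dt
    # Wrap the heading back into [-pi, pi].
    yaw = math.atan2(math.sin(yaw), math.cos(yaw))
    return Detection(
        x=det.x + det.vx * dt,
        y=det.y + det.vy * dt,
        yaw=yaw,
        vx=det.vx,
        vy=det.vy,
        yaw_rate=det.yaw_rate,
        timestamp=target_time,
    )


def position_gap(a: Detection, b: Detection) -> float:
    """Euclidean distance between two estimates of the same target."""
    return math.hypot(a.x - b.x, a.y - b.y)


# A collaborator's detection arrives 0.5 s late; the ego agent sees the
# same target at its own current timestamp (values are made up).
delayed = Detection(10.0, 5.0, 0.0, 4.0, 0.0, 0.0, timestamp=0.0)
ego_view = Detection(12.3, 5.1, 0.0, 4.0, 0.0, 0.0, timestamp=0.5)

aligned = align_to(delayed, ego_view.timestamp)
residual = position_gap(aligned, ego_view)
print(f"residual gap after alignment: {residual:.3f} m")
```

Even with both estimates at the same timestamp, a nonzero residual remains in this toy setup. That gap motivates treating spatial calibration as a step separate from temporal alignment.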
