WACV 2026

High-Level Semantics and Low-Level Features Fusion for Multi-Scale Object Detection in Dynamic Construction Environments

Abstract

Object detection in dynamic construction environments presents significant challenges due to large scale variations, occlusions, and clutter. Conventional deep learning models struggle to balance the semantic information needed for classification with the spatial detail required for localization. This paper introduces a novel framework that systematically fuses features from different network depths to resolve this trade-off. Our primary contribution is a Hierarchical Feature Adjustment architecture that employs a coarse-to-fine strategy, progressively adjusting detections. We enhance robustness with an Efficient RoI Aggregation module for contextual aggregation and improve localization with a Modified IoU loss. Furthermore, a proposed Overlap Discriminating Module aids non-maximum suppression in dense scenes. Extensive experiments on the SODA, COD, and Small Tools datasets show that our integrated approach significantly outperforms state-of-the-art methods, establishing a new benchmark for this critical application. We also demonstrate the model's generalizability to other domains with strong results on the BDD100K and COCO datasets.
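
For readers unfamiliar with the two standard components the abstract builds on, the following is a minimal sketch of plain intersection-over-union (IoU), the basic `1 - IoU` localization loss, and greedy non-maximum suppression. It is not the paper's Modified IoU loss or Overlap Discriminating Module, whose details are not given in the abstract. The function names, the `(x1, y1, x2, y2)` box format, and the 0.5 threshold are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) with x1 < x2, y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Standard intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred: Box, target: Box) -> float:
    """Baseline IoU loss (1 - IoU); the paper's modified variant is not shown."""
    return 1.0 - iou(pred, target)


def greedy_nms(boxes: Sequence[Box], scores: Sequence[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Classic greedy NMS: keep the highest-scoring box, drop any later box
    whose IoU with an already-kept box exceeds the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(greedy_nms(boxes, scores))
```

In dense scenes, a fixed threshold like this can suppress true detections of heavily overlapping objects, which is the kind of failure the abstract's Overlap Discriminating Module is described as addressing.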
