ICCV 2025

AnnofreeOD: Detecting All Classes at Low Frame Rates Without Human Annotations

Abstract

Manual annotation of 3D bounding boxes in large-scale 3D scenes is expensive and time-consuming. This motivates the exploration of annotation-free 3D object detection using unlabeled point cloud data. Existing unsupervised 3D detection frameworks predominantly identify moving objects via scene flow, which has significant limitations: (1) a limited number of detection classes (at most 3), (2) difficulty in detecting stationary objects, and (3) reliance on high frame rates. To address these limitations, we propose AnnofreeOD, a novel Annotation-free Object Detection framework based on 2D-to-3D knowledge distillation. First, we explore an effective strategy to generate high-quality pseudo boxes using single-frame 2D knowledge. Second, observing that the pseudo boxes from the first step are noisy, we introduce Noise-Resistant Regression (NRR) based on Box Augmentation (BA). AnnofreeOD achieves state-of-the-art performance across multiple experiments. On the nuScenes dataset, we establish the first annotation-free 10-class object detection baseline, achieving 40% of fully supervised performance. Furthermore, in 3-class and class-agnostic object detection tasks, our approach surpasses prior state-of-the-art methods by +9.3% mAP (+12.2% NDS) and +6.0% AP (+4.2% NDS), significantly improving precision.
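
To make the idea of lifting single-frame 2D knowledge into 3D pseudo boxes concrete, the following is a minimal sketch of one generic approach: keep the LiDAR points whose image projection falls inside a 2D detection box, then fit an axis-aligned 3D box around them. This is an illustrative assumption, not the pseudo-box strategy used by AnnofreeOD, which the abstract does not specify. The function name `lift_box_to_3d`, the pinhole intrinsics `fx, fy, cx, cy`, and the camera-frame convention (z pointing forward) are all hypothetical.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]          # (x, y, z) in the camera frame, z forward
Box2D = Tuple[float, float, float, float]   # (u_min, v_min, u_max, v_max) in pixels
Box3D = Tuple[Tuple[float, ...], Tuple[float, ...]]  # (center, size)


def lift_box_to_3d(
    points: List[Point],
    box: Box2D,
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> Optional[Box3D]:
    """Fit an axis-aligned 3D pseudo box to the points that project into a 2D box.

    Hypothetical helper for illustration only; assumes a pinhole camera model.
    """
    u_min, v_min, u_max, v_max = box
    inside = []
    for x, y, z in points:
        if z <= 0:
            continue  # points behind the camera cannot appear in the image
        # Pinhole projection of the 3D point onto the image plane.
        u = fx * x / z + cx
        v = fy * y / z + cy
        if u_min <= u <= u_max and v_min <= v <= v_max:
            inside.append((x, y, z))
    if not inside:
        return None  # no 3D evidence for this 2D detection
    mins = [min(p[i] for p in inside) for i in range(3)]
    maxs = [max(p[i] for p in inside) for i in range(3)]
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    size = tuple(hi - lo for lo, hi in zip(mins, maxs))
    return center, size
```

Boxes produced this way would typically be noisy, since background points inside the 2D box inflate the extent and occlusion truncates it. Noise of this kind is what a noise-resistant regression objective would need to tolerate.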
