CLoCKDistill: Consistent Location and Context aware Knowledge Distillation for DETRs

Qizhen Lan; Qing Tian

2026 WACV WACV 2026

CLoCKDistill: Consistent Location and Context aware Knowledge Distillation for DETRs

Abstract

Object detection has advanced significantly with Detection Transformers (DETRs). However, these models are computationally demanding, posing challenges for deployment in resource-constrained environments (e.g., self-driving cars). Knowledge distillation (KD) is an effective compression method widely applied to CNN detectors, but its application to DETR models has been limited. Most KD methods for DETRs fail to distill transformer-specific global context. Also, they blindly believe in the teacher model, which can sometimes be misleading. To bridge the gaps, this paper proposes Consistent Location-and-Context-aware Knowledge Distillation (CLoCKDistill) for DETR detectors, which includes both feature distillation and logit distillation components. For feature distillation, instead of distilling backbone features like many existing KD methods, we distill the transformer encoder output (i.e., memory) that contains valuable global context and long-range dependencies. Also, we enrich this memory with object location details during feature distillation so that the student model can prioritize relevant regions. To facilitate logit distillation, we create target-aware queries based on the ground truth, allowing both the student and teacher decoders to attend to consistent and accurate parts of encoder memory. Experiments on KITTI and COCO show our CLoCKDistill method's efficacy across various DETRs. Our method boosts student detector performance by 2.2% to 6.4%. Our code is available at: https://github.com/lanqz7766/CLoCKDistill.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Computer Vision and Deep Learning and Machine Learning

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio