WACV 2026

An improved architecture for part-based animal re-identification through semantic segmentation distillation

Abstract

Wildlife re-identification (Re-ID) is critical for non-invasive monitoring. Yet animal Re-ID performance remains far behind that of person Re-ID, owing to limited datasets and greater fine-grained appearance variability between individuals. One strategy is to adopt part-based methods, which attend more precisely to distinct anatomical regions. To adapt this strategy to animal Re-ID, we propose PAW-ViT (Part-AWare animal re-identification Vision Transformer), a ViT that replaces the standard classification token with K learnable part tokens, each specialized to a specific anatomical region of the animal. Spatial specialization is achieved via feature-based knowledge distillation: each token's attention over the image patches is trained to produce a semantic segmentation mask for its region. An additional aggregation token fuses the part embeddings into a single part-aware descriptor. Trained with a multi-task loss, PAW-ViT outperforms state-of-the-art animal Re-ID methods on ATRW (Amur tigers) and YakREID-103 (yaks), particularly under strong viewpoint variation such as the cross-camera setting.
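
To make the attention-to-mask supervision described above concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single attention head, a binary part mask given per patch, and a KL-divergence loss between the area-normalized mask and the part token's attention. The loss form and the names `part_attention` and `mask_distillation_loss` are hypothetical choices made for illustration; the abstract does not specify them.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def part_attention(part_query, patch_keys):
    """Attention of one part token over all image patches (single head)."""
    scale = 1.0 / math.sqrt(len(part_query))
    scores = [
        scale * sum(q * k for q, k in zip(part_query, key))
        for key in patch_keys
    ]
    return softmax(scores)


def mask_distillation_loss(attention, part_mask, eps=1e-8):
    """KL(target || attention), with the binary mask normalized to sum to 1.

    Assumed loss form, for illustration only. Returns 0.0 when the part is
    not visible (empty mask), so that token receives no supervision.
    """
    area = sum(part_mask)
    if area == 0:
        return 0.0
    loss = 0.0
    for a, m in zip(attention, part_mask):
        target = m / area
        if target > 0:
            loss += target * math.log(target / (a + eps))
    return loss


# Toy example: 4 patches with 2-dimensional keys. The query aligns with
# the first two patches, so a mask covering them yields the lower loss.
query = [1.0, 0.0]
keys = [[4.0, 0.0], [4.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
attn = part_attention(query, keys)
loss_matching = mask_distillation_loss(attn, [1, 1, 0, 0])
loss_mismatched = mask_distillation_loss(attn, [0, 0, 1, 1])
```

In this toy setup the loss is small when the token attends to the patches covered by its mask and large otherwise, which is the signal that would push each part token toward its own anatomical region.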
