Do not Lose the Details: Reinforced Representation Learning for High Performance Visual Tracking

Qiang Wang; Mengdan Zhang; Junliang Xing; Jin Gao; Weiming Hu; Steve Maybank

IJCAI 2018

Abstract

This work presents a novel end-to-end trainable CNN model for high-performance visual object tracking. It learns both low-level fine-grained representations and a high-level semantic embedding space in a mutually reinforcing way, and a multi-task learning strategy is proposed to perform correlation analysis on the representations from both levels. In particular, a fully convolutional encoder-decoder network is designed to reconstruct the original visual features from the semantic projections, so that all the geometric information is preserved. Moreover, the correlation filter layer that operates on the fine-grained representations leverages a global context constraint for accurate object appearance modeling. The correlation filter in this layer is updated online efficiently, without network fine-tuning. The proposed tracker therefore benefits from two complementary effects: the adaptability of the fine-grained correlation analysis and the generalization capability of the semantic embedding. Extensive experimental evaluations on four popular benchmarks demonstrate its state-of-the-art performance.
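
To make the idea of a correlation filter that is updated online without network fine-tuning more concrete, the following is a minimal NumPy sketch of a generic closed-form discriminative correlation filter solved in the Fourier domain. It is an illustration under stated assumptions, not the authors' implementation: the global context constraint and the CNN feature extractor described in the abstract are omitted, random arrays stand in for feature maps, and the function names (gaussian_label, train_filter, update_filter, respond) and parameters (sigma, lam, lr) are hypothetical.

```python
import numpy as np


def gaussian_label(h, w, sigma=2.0):
    """Gaussian regression target whose peak is wrapped to index (0, 0)."""
    ys = np.arange(h) - h // 2
    xs = np.arange(w) - w // 2
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    g = np.exp(-(yy ** 2 + xx ** 2) / (2.0 * sigma ** 2))
    # Move the peak from the patch centre to the origin (circular shift).
    return np.roll(g, (-(h // 2), -(w // 2)), axis=(0, 1))


def train_filter(x, y, lam=1e-4):
    """Ridge-regression filter for a (C, H, W) feature map x and (H, W) label y.

    Returns the numerator and denominator of the filter separately so that
    they can be averaged over time (assumed update scheme, not from the paper).
    """
    X = np.fft.fft2(x, axes=(-2, -1))
    Y = np.fft.fft2(y)
    num = Y[None, :, :] * np.conj(X)            # per-channel numerator
    den = np.sum(np.abs(X) ** 2, axis=0) + lam  # shared, regularised denominator
    return num, den


def update_filter(num, den, x, y, lr=0.01, lam=1e-4):
    """Online update by linear interpolation; no gradient steps are needed."""
    new_num, new_den = train_filter(x, y, lam)
    return (1 - lr) * num + lr * new_num, (1 - lr) * den + lr * new_den


def respond(num, den, z):
    """Correlation response map of the filter on a (C, H, W) search feature z."""
    Z = np.fft.fft2(z, axes=(-2, -1))
    R = np.sum(num * Z, axis=0) / den
    return np.real(np.fft.ifft2(R))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((4, 32, 32))      # stand-in for fine-grained features
    y = gaussian_label(32, 32)
    num, den = train_filter(x, y)
    z = np.roll(x, (3, 5), axis=(1, 2))       # target displaced by (3, 5)
    r = respond(num, den, z)
    print(np.unravel_index(np.argmax(r), r.shape))  # location of the response peak
```

In this sketch the peak of the response map gives the circular displacement of the target between the training patch and the search patch, and update_filter adapts the filter to a new frame with a running average rather than by retraining any network weights.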

Topics

Machine Learning > Core Methods > Representation Learning
Computer Vision > Analysis > Object Tracking

Keywords

representation learning, multi-task learning, visual tracking, semantic embedding, encoder-decoder network, correlation filter
