Learning Video Representations From Correspondence Proposals

Xingyu Liu; Joon-Young Lee; Hailin Jin

CVPR 2019

Abstract

Correspondences between frames encode rich information about dynamic content in videos. However, they are challenging to capture and learn effectively because of their irregular structure and complex dynamics. In this paper, we propose a novel neural network that learns video representations by aggregating information from potential correspondences. This network, named CPNet, can learn evolving 2D fields with temporal consistency. In particular, it can effectively learn video representations by mixing appearance and long-range motion from RGB-only input. We provide extensive ablation experiments to validate our model. CPNet outperforms existing methods on Kinetics and achieves state-of-the-art performance on Something-Something and Jester. We also analyze the behavior of our model and show its robustness to errors in proposals.

Topics

Artificial Intelligence > Learning Paradigms > Transfer Learning

Machine Learning > Core Methods > Representation Learning

Deep Learning > Architectures > Neural Networks

Computer Vision > Processing > Video Understanding

Deep Learning > Learning Types > Representation Learning

Keywords

motion analysis, temporal consistency, video representation, appearance feature, motion representation, video representation learning, neural network, motion feature, correspondence proposal
