Learning From Temporal Gradient for Semi-Supervised Action Recognition

Junfei Xiao; Longlong Jing; Lin Zhang; Ju He; Qi She; Zongwei Zhou; Alan Yuille; Yingwei Li

CVPR 2022

Abstract

Semi-supervised video action recognition tends to enable deep neural networks to achieve remarkable performance even with very limited labeled data. However, existing methods are mainly transferred from current image-based methods (e.g., FixMatch). Because they do not specifically exploit the temporal dynamics and inherent multimodal attributes of video, their results can be suboptimal. To better leverage the temporal information encoded in videos, in this paper we introduce temporal gradient as an additional modality for more attentive feature extraction. Specifically, our method explicitly distills fine-grained motion representations from temporal gradient (TG) and imposes consistency across modalities (i.e., RGB and TG). The performance of semi-supervised action recognition is significantly improved without additional computation or parameters during inference. Our method achieves state-of-the-art performance on three video action recognition benchmarks (i.e., Kinetics-400, UCF-101, and HMDB-51) under several typical semi-supervised settings (i.e., different ratios of labeled data).
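To make the notion of a temporal gradient concrete, here is a minimal sketch that assumes TG is approximated by the pixel-wise difference between RGB frames a fixed number of steps apart. The abstract does not spell out the computation, so the function name `temporal_gradient`, the `stride` parameter, and the `(T, H, W, C)` clip layout are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def temporal_gradient(clip: np.ndarray, stride: int = 1) -> np.ndarray:
    """Approximate the temporal gradient of a video clip by frame differencing.

    Assumption (not stated in the abstract): TG is the pixel-wise difference
    between frames `stride` steps apart.

    clip: array of shape (T, H, W, C), e.g. uint8 RGB frames.
    Returns a float32 array of shape (T - stride, H, W, C).
    """
    if clip.ndim != 4:
        raise ValueError("expected a clip of shape (T, H, W, C)")
    if not 1 <= stride < clip.shape[0]:
        raise ValueError("stride must satisfy 1 <= stride < number of frames")
    # Cast first so that uint8 subtraction cannot wrap around.
    frames = clip.astype(np.float32)
    return frames[stride:] - frames[:-stride]


# Toy clip: 4 frames of 2x2 RGB, brightness jumps by 10 between frames 1 and 2.
clip = np.zeros((4, 2, 2, 3), dtype=np.uint8)
clip[2:] = 10
tg = temporal_gradient(clip)
print(tg.shape)          # one fewer frame than the input
print(tg[:, 0, 0, 0])    # non-zero only where the content changes
```

Static regions cancel out in the difference, so the result mostly carries motion cues. This is a plausible reason for treating it as a separate modality alongside RGB during training, while inference uses RGB alone.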

Topics

Machine Learning > Learning Types > Semi-Supervised Learning
Machine Learning > Application Areas > Data Augmentation
Computer Vision > Analysis > Action Recognition
Machine Learning > Learning Types > Multi-Modal Learning
Computer Vision > Analysis > Video Understanding
Machine Learning > Learning Paradigms > Semi-Supervised Learning
Deep Learning > Learning Types > Semi-Supervised Learning

Keywords

semi-supervised learning, action recognition, video understanding, consistency regularization, consistency training, video action recognition, temporal gradient
