CVPR 2013
Representing Videos Using Mid-level Discriminative Patches
Abstract
We propose a representation for videos based on mid-level discriminative spatio-temporal patches. These patches might correspond to a primitive human action, a semantic object, or perhaps a random but informative spatio-temporal region of the video. What defines them is that they are both discriminative and representative. We automatically mine these patches from hundreds of training videos and experimentally demonstrate that they establish correspondence across videos and align the videos for label transfer techniques. Furthermore, these patches can be used as a discriminative vocabulary for action classification, where they demonstrate state-of-the-art performance on the UCF50 and Olympics datasets.
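To make the idea of a "discriminative vocabulary" concrete, here is a minimal sketch, not the authors' pipeline. It assumes that each mined patch has already been turned into a linear detector, that a video is a bag of spatio-temporal cuboid descriptors, and that a video is encoded by the strongest response of each detector. The function names, the max-pooling step, and the nearest-centroid classifier are all illustrative assumptions. The abstract does not specify any of them.

```python
import numpy as np

def encode_video(cuboid_features, detectors, biases):
    """Encode one video as the strongest response of each patch detector.

    cuboid_features: (n_cuboids, d) descriptors of spatio-temporal cuboids.
    detectors:       (n_patches, d) linear weights, one row per mined patch
                     (hypothetical; assumed to be trained beforehand).
    biases:          (n_patches,) detector offsets.
    """
    scores = cuboid_features @ detectors.T + biases  # (n_cuboids, n_patches)
    return scores.max(axis=0)                        # max-pool over cuboids

def fit_centroids(encodings, labels):
    """Average the encodings of each action class (stand-in classifier)."""
    classes = np.unique(labels)
    centroids = np.stack([encodings[labels == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict(encoding, classes, centroids):
    """Return the class whose centroid is closest to the encoding."""
    distances = np.linalg.norm(centroids - encoding, axis=1)
    return classes[np.argmin(distances)]

# Toy data: two detectors in a 2-D descriptor space (purely illustrative).
detectors = np.eye(2)
biases = np.zeros(2)
train_videos = [
    np.array([[1.0, 0.0], [0.1, 0.1]]),  # action 0: fires detector 0
    np.array([[0.9, 0.1], [0.0, 0.0]]),  # action 0
    np.array([[0.0, 1.0], [0.2, 0.0]]),  # action 1: fires detector 1
    np.array([[0.1, 0.8]]),              # action 1
]
train_labels = np.array([0, 0, 1, 1])

train_encodings = np.stack(
    [encode_video(v, detectors, biases) for v in train_videos]
)
classes, centroids = fit_centroids(train_encodings, train_labels)

query = np.array([[0.05, 0.0], [0.95, 0.05]])
query_encoding = encode_video(query, detectors, biases)
predicted_action = predict(query_encoding, classes, centroids)
```

In this sketch the encoding has one dimension per patch, so its length equals the vocabulary size regardless of how long the video is. Any standard classifier could replace the nearest-centroid step.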
Authors
Topics
Machine Learning > Core Methods > Representation Learning
Machine Learning > Learning Types > Weakly Supervised Learning
Computer Vision > Analysis > Action Recognition
Machine Learning > Core Methods > Feature Learning
Computer Vision > Core AI > Multimodal Learning
Computer Vision > Analysis > Video Understanding