Concurrent Action Detection with Structural Prediction

Ping Wei; Nanning Zheng; Yibiao Zhao; Song-chun Zhu

ICCV 2013

Abstract

Action recognition has often been posed as a classification problem, which assumes that a video sequence has only one action class label and that different actions are independent. However, a single human body can perform multiple actions concurrently, and different actions interact with each other. This paper proposes a concurrent action detection model in which action detection is formulated as a structural prediction problem. In this model, an interval in a video sequence can be described by multiple action labels. A detected action interval is determined both by the unary local detector and by its relations with other actions. We use a wavelet feature to represent the action sequence and design a composite temporal logic descriptor to describe the action relations. The model parameters are trained by structural SVM learning. Given a long video sequence, a sequential decision window search algorithm is designed to detect the actions. Experiments on our newly collected concurrent action dataset demonstrate the strength of our method.
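To make the idea of combining a unary detector with action relations concrete, here is a minimal sketch of a structured score over a set of labeled intervals. It is an illustration under stated assumptions, not the paper's formulation: the names Interval, temporal_relation, and structured_score are hypothetical, the three-way relation (disjoint, contains, overlaps) is a coarse stand-in for the composite temporal logic descriptor, and the weights are hand-picked rather than learned by a structural SVM.

```python
from itertools import combinations
from typing import NamedTuple


class Interval(NamedTuple):
    """A candidate detection (hypothetical container, not from the paper)."""
    label: str    # action class name
    start: int    # first frame, inclusive
    end: int      # last frame, exclusive
    unary: float  # score from a local detector (assumed given)


def temporal_relation(a: Interval, b: Interval) -> str:
    """Coarse stand-in for a temporal relation descriptor."""
    if a.end <= b.start or b.end <= a.start:
        return "disjoint"
    a_contains_b = a.start <= b.start and b.end <= a.end
    b_contains_a = b.start <= a.start and a.end <= b.end
    if a_contains_b or b_contains_a:
        return "contains"
    return "overlaps"


def structured_score(intervals, pair_weights) -> float:
    """Sum of unary scores plus one pairwise term per pair of intervals."""
    score = sum(iv.unary for iv in intervals)
    for a, b in combinations(intervals, 2):
        # Sort the labels so the lookup key does not depend on pair order.
        first, second = sorted((a.label, b.label))
        key = (first, second, temporal_relation(a, b))
        score += pair_weights.get(key, 0.0)  # unseen relations contribute 0
    return score


# Two concurrent actions: a phone call that happens while drinking.
detections = [
    Interval("drink", 0, 50, 1.0),
    Interval("call", 10, 40, 0.5),
]
# Assumed weight: a call nested inside drinking is mildly encouraged.
weights = {("call", "drink", "contains"): 0.25}
total = structured_score(detections, weights)
```

In this toy setup the pairwise term raises the score of a labeling whose intervals stand in a favored relation, which is the sense in which a detected interval depends on both its local evidence and the other actions around it.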

Authors

Ping Wei, Nanning Zheng, Yibiao Zhao, Song-chun Zhu

Topics

Machine Learning > Core Methods > Classification
Computer Vision > Analysis > Action Recognition
Computer Vision > Analysis > Video Understanding

Keywords

action recognition, multi-label classification, structural SVM, temporal logic, action detection, concurrent action detection, structural prediction, concurrent action
