MODEC: Multimodal Decomposable Models for Human Pose Estimation

Ben Sapp; Ben Taskar

CVPR 2013

Abstract

We propose a multimodal, decomposable model for articulated human pose estimation in monocular images. A typical approach to this problem is to use a linear structured model, which struggles to capture the wide range of appearance present in realistic, unconstrained images. In this paper, we instead propose a model of human pose that explicitly captures a variety of pose modes. Unlike other multimodal models, our approach includes both global and local pose cues and uses a convex objective and joint training for mode selection and pose estimation. We also employ a cascaded mode selection step which controls the trade-off between speed and accuracy, yielding a 5x speedup in inference and learning. Our model outperforms state-of-the-art approaches across the accuracy-speed trade-off curve for several pose datasets. This includes our newly collected dataset of people in movies, FLIC, which contains an order of magnitude more labeled data for training and testing than existing datasets. The new dataset and code are available online.
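The cascaded mode selection idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`cascaded_mode_inference`, `global_score`, `local_inference`), the additive combination of global and local scores, and the top-k pruning rule are all assumptions made for illustration. The sketch only shows how a cheap global score over pose modes might prune the candidates before the more expensive per-mode pose inference runs, with `keep` controlling the speed-accuracy trade-off.

```python
from typing import Callable, Optional, Sequence, Tuple

Pose = Sequence[Tuple[int, int]]  # hypothetical: one (x, y) location per joint


def cascaded_mode_inference(
    num_modes: int,
    global_score: Callable[[int], float],
    local_inference: Callable[[int], Tuple[float, Pose]],
    keep: int,
) -> Tuple[int, Pose, float]:
    """Return (best_mode, best_pose, best_total_score).

    global_score(z): cheap score of mode z from image-level cues (assumed).
    local_inference(z): expensive per-mode inference returning the best
        local score and pose under mode z (assumed).
    keep: number of top-ranked modes that survive the first stage.
    """
    if not 1 <= keep <= num_modes:
        raise ValueError("keep must be between 1 and num_modes")

    # Stage 1: rank all modes by the cheap global score and prune to `keep`.
    survivors = sorted(range(num_modes), key=global_score, reverse=True)[:keep]

    # Stage 2: run the expensive inference only on the surviving modes.
    best: Optional[Tuple[int, Pose, float]] = None
    for z in survivors:
        local, pose = local_inference(z)
        total = global_score(z) + local  # assumed additive combination
        if best is None or total > best[2]:
            best = (z, pose, total)
    assert best is not None  # guaranteed because keep >= 1
    return best


# Toy usage with made-up scores for three modes.
_global = [0.1, 0.9, 0.5]
_local = {0: (5.0, [(0, 0)]), 1: (0.2, [(1, 1)]), 2: (1.0, [(2, 2)])}

fast = cascaded_mode_inference(3, lambda z: _global[z], lambda z: _local[z], keep=2)
full = cascaded_mode_inference(3, lambda z: _global[z], lambda z: _local[z], keep=3)
```

In this toy setting, a smaller `keep` evaluates fewer modes and can miss a mode whose global score is low but whose local evidence is strong, which is the kind of trade-off the cascade is meant to expose.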

Authors

Ben Sapp, Ben Taskar

Topics

Artificial Intelligence > Core AI > Multimodal Learning
Computer Vision > Analysis > 3D Vision
Computer Vision > Analysis > Human Pose Estimation

Keywords

convex optimization, multimodal learning, 3D vision, human pose estimation, deformable part model
