Pyramid Coding for Functional Scene Element Recognition in Video Scenes

Eran Swears; Anthony Hoogs; Kim Boyer

ICCV 2013

Abstract

Recognizing functional scene elements in video scenes based on the behaviors of moving objects that interact with them is an emerging problem of interest. Existing approaches have a limited ability to characterize elements such as crosswalks, intersections, and buildings that have low activity, are multi-modal, or have indirect evidence. Our approach recognizes the low-activity and multi-modal elements (crosswalks/intersections) by introducing a hierarchy of descriptive clusters to form a pyramid of codebooks that is sparse in the number of clusters and dense in content. The incorporation of local behavioral context such as person-enter-building and vehicle-parking nearby enables the detection of elements that do not have direct motion-based evidence, e.g., buildings. These two contributions significantly improve scene element recognition when compared against three state-of-the-art approaches. Results are shown on typical ground-level surveillance video and, for the first time, on the more complex Wide Area Motion Imagery.
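The abstract does not spell out how the pyramid of codebooks is built, so the following is only a generic sketch of the broader idea of a multi-level codebook, not the authors' algorithm. It assumes plain k-means at a few hypothetical resolutions (1, 2, and 4 clusters) and encodes a set of feature vectors by concatenating one normalized histogram per level. All function names, the level sizes, and the toy 2-D features are illustrative assumptions.

```python
import random

def sq_dist(a, b):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centers):
    # Index of the center closest to the point.
    return min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))

def kmeans(points, k, iters=20, seed=0):
    # Basic k-means with a seeded random initialization (assumed choice).
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        for i, g in enumerate(groups):
            if g:  # keep the previous center if a cluster becomes empty
                centers[i] = [sum(c) / len(g) for c in zip(*g)]
    return centers

def build_pyramid(points, sizes=(1, 2, 4)):
    # One codebook per level; the level sizes are hypothetical.
    return [kmeans(points, k) for k in sizes]

def encode(points, pyramid):
    # Concatenate one normalized histogram of cluster assignments per level.
    code = []
    total = float(len(points))
    for centers in pyramid:
        hist = [0] * len(centers)
        for p in points:
            hist[nearest(p, centers)] += 1
        code.extend(h / total for h in hist)
    return code

# Toy 2-D features standing in for per-location behavior descriptors.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1),
          (5.2, 4.9), (4.9, 5.0), (9.0, 0.1), (9.1, 0.0)]
pyramid = build_pyramid(points)
code = encode(points, pyramid)
```

Here `code` has one entry per cluster across all levels (1 + 2 + 4 = 7), and each level's entries sum to 1. Coarse levels summarize the overall distribution while finer levels separate distinct modes, which is the general motivation for stacking codebooks.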

Authors

Eran Swears, Anthony Hoogs, Kim Boyer

Topics

Computer Vision > Analysis > Scene Understanding

Computer Vision > Processing > Video Understanding

Keywords

video analysis, scene recognition, codebook learning, scene element recognition, pyramid coding, video scene, behavioral context, functional scene element
