Dynamic Zoom-In Network for Fast Object Detection in Large Images

Mingfei Gao; Ruichi Yu; Ang Li; Vlad I. Morariu; Larry S. Davis

CVPR 2018

Abstract

We introduce a generic framework that reduces the computational cost of object detection while retaining accuracy in scenarios where objects of varied sizes appear in high-resolution images. Detection progresses in a coarse-to-fine manner, first on a down-sampled version of the image and then on a sequence of higher-resolution regions identified as likely to improve detection accuracy. Built upon reinforcement learning, our approach consists of a model (R-net) that uses coarse detection results to predict the potential accuracy gain from analyzing a region at a higher resolution, and another model (Q-net) that sequentially selects regions to zoom in on. Experiments on the Caltech Pedestrians dataset show that our approach reduces the number of processed pixels by over 50% without a drop in detection accuracy. The merits of our approach become more significant on a high-resolution test set collected from the YFCC100M dataset, where our approach maintains high detection performance while reducing the number of processed pixels by about 70% and the detection time by over 50%.
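The coarse-to-fine idea in the abstract can be illustrated with a small sketch. This is not the authors' method: the learned Q-net is replaced here by a simple greedy rule, and the gain map is a hand-written stand-in for what an R-net-like model might predict. All function names, the grid layout, and the `min_gain` threshold are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple

# A window on the gain-map grid: (row, col, height, width). Hypothetical layout.
Region = Tuple[int, int, int, int]


def window_sum(gain: List[List[float]], r: int, c: int, h: int, w: int) -> float:
    """Sum of predicted gains inside the window whose top-left cell is (r, c)."""
    return sum(gain[i][j] for i in range(r, r + h) for j in range(c, c + w))


def select_zoom_regions(gain: List[List[float]], h: int, w: int,
                        max_regions: int, min_gain: float = 0.0) -> List[Region]:
    """Greedily pick up to max_regions windows with the highest summed gain.

    Greedy stand-in for a learned selection policy (assumption): after each
    pick, gains inside the chosen window are zeroed so that later picks
    favour areas that have not been zoomed into yet.
    """
    gain = [row[:] for row in gain]  # work on a copy, leave the input intact
    rows, cols = len(gain), len(gain[0])
    chosen: List[Region] = []
    for _ in range(max_regions):
        best: float = min_gain
        best_rc: Optional[Tuple[int, int]] = None
        for r in range(rows - h + 1):
            for c in range(cols - w + 1):
                s = window_sum(gain, r, c, h, w)
                if s > best:  # strict: windows at or below min_gain are skipped
                    best, best_rc = s, (r, c)
        if best_rc is None:
            break  # no remaining window is worth zooming into
        r, c = best_rc
        chosen.append((r, c, h, w))
        for i in range(r, r + h):
            for j in range(c, c + w):
                gain[i][j] = 0.0
    return chosen


def processed_pixel_fraction(rows: int, cols: int, regions: List[Region],
                             downsample: int) -> float:
    """Cells analysed (coarse pass plus zoomed windows) relative to full resolution.

    Overlapping windows are counted once per window, since each would be
    processed separately.
    """
    full = rows * cols
    coarse = full / (downsample ** 2)
    zoomed = sum(rh * rw for _, _, rh, rw in regions)
    return (coarse + zoomed) / full


# Toy 4x4 gain map: one cluster of high predicted gain in the centre.
toy_gain = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.8, 0.0],
    [0.0, 0.7, 0.6, 0.0],
    [0.0, 0.0, 0.0, 0.1],
]
toy_regions = select_zoom_regions(toy_gain, h=2, w=2, max_regions=2, min_gain=0.5)
toy_fraction = processed_pixel_fraction(4, 4, toy_regions, downsample=2)
```

On this toy input only the central 2x2 window clears the threshold, so the coarse pass plus one zoom-in touches half as many cells as a full-resolution pass. In the paper the selection is made by a trained policy rather than this greedy rule, so the sketch conveys only the accounting of processed pixels, not the reported results.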

Authors

Mingfei Gao, Ruichi Yu, Ang Li, Vlad I. Morariu, Larry S. Davis

Topics

Computer Vision > Analysis > Object Detection
Reinforcement Learning > Methods > Deep RL
Deep Learning > Learning Types > Reinforcement Learning
Deep Learning > Optimization & Theory > Efficient Computing

Keywords

reinforcement learning, object detection, computational efficiency, region proposal, image resolution, image pyramid, region selection
