Learning to Select Views for Efficient Multi-View Understanding

Yunzhong Hou; Stephen Gould; Liang Zheng

CVPR 2024

Abstract

Multiple camera view (multi-view) setups have proven useful in many computer vision applications. However, the high computational cost associated with multiple views creates a significant challenge for end devices with limited computational resources. In modern CPUs, pipelining breaks a longer job into steps and enables parallelism over sequential steps from multiple jobs. Inspired by this, we study selective view pipelining for efficient multi-view understanding, which breaks the computation of multiple views into steps and computes only the most helpful views/steps in parallel for the best efficiency. To this end, we use reinforcement learning to learn a very light view selection module that analyzes the target object or scenario from initial views and selects the next-best-view for recognition or detection for pipeline computation. Experimental results on multi-view classification and detection tasks show that our approach achieves promising performance while using only 2 or 3 out of N available views, significantly reducing computational costs while maintaining parallelism on the GPU through selective view pipelining.
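The selection loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`select_and_process`, `max_pool`, `toy_extract`, `toy_policy`), the element-wise max aggregation, and the greedy argmax choice are all assumptions made for illustration, and the learned reinforcement learning policy is replaced by a hand-written scoring function.

```python
from typing import Callable, List, Sequence, Tuple

Feature = List[float]


def max_pool(a: Feature, b: Feature) -> Feature:
    """Element-wise max, assumed here as a simple aggregation over views."""
    return [max(x, y) for x, y in zip(a, b)]


def select_and_process(
    views: Sequence[Sequence[float]],
    extract: Callable[[Sequence[float]], Feature],
    score_views: Callable[[Feature, List[int]], List[float]],
    initial_view: int,
    budget: int,
) -> Tuple[Feature, List[int]]:
    """Process at most `budget` views, picking each next view from the
    features gathered so far. Returns (aggregated feature, visited indices)."""
    visited = [initial_view]
    agg = extract(views[initial_view])  # only the initial view is computed up front
    while len(visited) < min(budget, len(views)):
        scores = score_views(agg, visited)  # hypothetical policy: one score per view
        candidates = [i for i in range(len(views)) if i not in visited]
        nxt = max(candidates, key=lambda i: scores[i])  # greedy next-best-view
        agg = max_pool(agg, extract(views[nxt]))  # compute only the chosen view
        visited.append(nxt)
    return agg, visited


# Toy usage: four "views" that are already feature vectors.
calls = []


def toy_extract(view: Sequence[float]) -> Feature:
    calls.append(view)  # record how many views were actually computed
    return list(view)


def toy_policy(agg: Feature, visited: List[int]) -> List[float]:
    # Stand-in for a learned policy: simply prefers higher view indices.
    return [float(i) for i in range(4)]


views = [[1.0, 0.0], [0.0, 2.0], [3.0, 0.0], [0.0, 0.5]]
feature, visited = select_and_process(
    views, toy_extract, toy_policy, initial_view=0, budget=2
)
```

With a budget of 2, only two of the four views pass through `toy_extract`; the other two are never computed, which is the source of the savings the abstract describes. In a batched setting, the same step could presumably be run for many objects at once, which would correspond to the pipelined parallelism mentioned above.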

Topics

Machine Learning > Application Areas > Efficient Computing
Reinforcement Learning > Methods > Deep RL
Machine Learning > Learning Types > Reinforcement Learning
Artificial Intelligence > Core AI > Computer Vision
Computer Vision > Core AI > Efficient Computing
Artificial Intelligence > Core AI > Reinforcement Learning
Machine Learning > Learning Types > Multi-View Learning

Keywords

reinforcement learning; computer vision; efficient computing; multi-view learning; view selection; multi-view understanding
