Learning to Exploit Stability for 3D Scene Parsing

Yilun Du; Zhijian Liu; Hector Basevi; Ales Leonardis; Bill Freeman; Josh Tenenbaum; Jiajun Wu

NeurIPS 2018

Abstract

Human scene understanding uses a variety of visual and non-visual cues to perform inference on object types, poses, and relations. Physics is a rich and universal cue that we exploit to enhance scene understanding. We integrate the physical cue of stability into the learning process using a REINFORCE approach coupled to a physics engine, and apply this to the problem of producing the 3D bounding boxes and poses of objects in a scene. We first show that applying physics supervision to an existing scene understanding model increases performance, produces more stable predictions, and allows training to an equivalent performance level with fewer annotated training examples. We then present a novel architecture for 3D scene parsing named Prim R-CNN, which learns to predict bounding boxes as well as their 3D size, translation, and rotation. With physics supervision, Prim R-CNN outperforms existing scene understanding approaches on this problem. Finally, we show that applying physics supervision on unlabeled real images improves real domain transfer of models trained on synthetic data.
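To illustrate why a REINFORCE-style estimator suits a stability signal that comes from a non-differentiable simulator, here is a minimal toy sketch. It is not the paper's pipeline: the physics engine is replaced by a hypothetical one-dimensional check (`toy_stability_reward`), the "prediction" is a single Gaussian over a box position, and every name and hyperparameter below is an assumption made for illustration.

```python
import numpy as np

def toy_stability_reward(x, support_center=0.0, support_half_width=0.5):
    """Hypothetical stand-in for a physics engine.

    Returns 1.0 where a box whose centre of mass sits at x rests over
    the support, and 0.0 where it would topple. Not differentiable in x.
    """
    return np.asarray(np.abs(x - support_center) <= support_half_width, dtype=float)

def reinforce_step(mu, sigma, rng, n_samples=256, lr=0.05):
    # Sample candidate positions from the Gaussian prediction N(mu, sigma^2).
    x = rng.normal(mu, sigma, size=n_samples)
    reward = toy_stability_reward(x)
    # Subtract the mean reward as a baseline to reduce estimator variance.
    advantage = reward - reward.mean()
    # Score function of a Gaussian with respect to its mean: (x - mu) / sigma^2.
    grad_mu = np.mean(advantage * (x - mu) / sigma**2)
    # Gradient ascent on the expected stability reward.
    return mu + lr * grad_mu

def train(mu0=1.0, sigma=0.5, steps=500, seed=0):
    rng = np.random.default_rng(seed)
    mu = mu0
    for _ in range(steps):
        mu = reinforce_step(mu, sigma, rng)
    return mu

if __name__ == "__main__":
    print("learned position:", train())
```

The reward is only ever evaluated at sampled positions, so no gradient has to flow through the stability check itself. The same property is what would let a real simulator be used in place of the toy function.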

Topics

Computer Vision > Analysis > 3D Vision
Computer Vision > Analysis > Scene Understanding
Artificial Intelligence > Core AI > Reinforcement Learning

Keywords

reinforcement learning, semantic segmentation, pose estimation, scene understanding, bounding box, bounding box prediction, object pose estimation, 3D bounding box, 3D scene parsing, physics supervision
