Attentional Separation-and-Aggregation Network for Self-supervised Depth-Pose Learning in Dynamic Scenes

Feng Gao; Jincheng Yu; Hao Shen; Yu Wang; Huazhong Yang

2020 CORL CoRL 2020

Attentional Separation-and-Aggregation Network for Self-supervised Depth-Pose Learning in Dynamic Scenes

Abstract

Learning depth and ego-motion from unlabeled videos via self-supervision from epipolar projection can improve the robustness and accuracy of the 3D perception and localization of vision-based robots. However, the rigid projection computed by ego-motion cannot represent all scene points, such as points on moving objects, leading to false guidance in these regions. To address this problem, we propose an Attentional Separation-and-Aggregation Network (ASANet), which can learn to distinguish and extract the scene’s static and dynamic characteristics via the attention mechanism. We further propose a novel MotionNet with an ASANet as the encoder, followed by two separate decoders, to estimate the camera’s ego-motion and the scene’s dynamic motion field. Then, we introduce an auto-selecting approach to detect the moving objects for dynamic-aware learning automatically. Empirical experiments demonstrate that our method can achieve the state-of-the-art performance on the KITTI benchmark.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Machine Learning

🧭 Keyword Pioneer — epipolar projection

🐣 Hot Topic Early Bird — dynamic scene

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Feng Gao , Jincheng Yu , Hao Shen , Yu Wang , Huazhong Yang

Topics

Artificial Intelligence > Core AI > Autonomous Vehicles Machine Learning > Learning Types > Self-Supervised Learning Deep Learning > Techniques > Model Architecture

Keywords

self-supervised learning attention mechanism depth estimation dynamic scene epipolar projection

Download PDF

Related papers

Augmenting GAIL with BC for sample efficient imitation learning 2020

Neuro-Symbolic Program Search for Autonomous Driving Decision Module Design 2020

LiRaNet: End-to-End Trajectory Prediction using Spatio-Temporal Radar Fusion 2020

DROGON: A Trajectory Prediction Model based on Intention-Conditioned Behavior Reasoning 2020

CAMPs: Learning Context-Specific Abstractions for Efficient Planning in Factored MDPs 2020