Meta-Auxiliary Learning for Future Depth Prediction in Videos
Abstract
We consider a new problem: future depth prediction in videos. Given a sequence of observed frames, the goal is to predict the depth map of a future frame that has not yet been observed. Depth estimation plays a vital role in scene understanding and decision-making in intelligent systems, and predicting future depth maps can help autonomous vehicles anticipate the behavior of surrounding objects. Our proposed model has a two-branch architecture: one branch performs the primary task of future depth estimation, and the other performs an auxiliary task of image reconstruction. The auxiliary branch can act as a regularizer. Inspired by recent work on test-time adaptation, we also use the auxiliary task during testing to adapt the model to a specific test video. We further propose a novel meta-auxiliary learning scheme that learns the model specifically for the purpose of effective test-time adaptation. Experimental results demonstrate that our proposed approach significantly outperforms alternative methods.
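
The interplay between the two branches, test-time adaptation, and meta-auxiliary learning can be sketched as a bilevel objective. This is only an illustrative formalization, assuming a gradient-based inner update in the style of model-agnostic meta-learning; the symbols ($\theta$ for the model parameters, $\alpha$ for the adaptation step size, $\mathcal{L}_{\text{aux}}$ and $\mathcal{L}_{\text{pri}}$ for the image reconstruction and future depth losses, and $V$ for a video) are notation introduced here and do not appear in the abstract.

```latex
% Inner step (assumed form): adapt the model to video V using only the auxiliary loss
\theta'_V = \theta - \alpha \, \nabla_{\theta} \mathcal{L}_{\text{aux}}(\theta; V)

% Outer step (assumed form, training only): choose \theta so that the adapted
% parameters perform well on the primary task of future depth prediction
\min_{\theta} \; \sum_{V} \mathcal{L}_{\text{pri}}(\theta'_V; V)
```

Under this reading, only the inner step would be run at test time, since image reconstruction is supervised by the observed frames themselves and needs no depth ground truth for the test video.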