Video Instance Segmentation With a Propose-Reduce Paradigm

Huaijia Lin; Ruizheng Wu; Shu Liu; Jiangbo Lu; Jiaya Jia

ICCV 2021

Abstract

Video instance segmentation (VIS) aims to segment and associate all instances of predefined classes in every frame of a video. Prior methods usually obtain segmentation for a frame or clip first, and then merge the incomplete results by tracking or matching. These methods may accumulate errors in the merging step. In contrast, we propose a new paradigm -- Propose-Reduce -- that generates complete sequences for input videos in a single step. We further build a sequence propagation head on an existing image-level instance segmentation network for long-term propagation. To ensure robustness and high recall of the proposed framework, multiple sequences are proposed, and redundant sequences of the same instance are then reduced. We achieve state-of-the-art performance on two representative benchmark datasets -- 47.6% AP on the YouTube-VIS validation set and 70.4% J&F on the DAVIS-UVOS validation set.
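The "reduce" step described in the abstract can be pictured as a form of non-maximum suppression applied to whole sequences rather than to single-frame detections. The sketch below is only an illustration under stated assumptions: the abstract does not specify the reduction criterion, so the video-level IoU, the greedy score-ordered suppression, the 0.5 threshold, and the names `sequence_iou` and `reduce_sequences` are all hypothetical and not taken from the paper.

```python
from typing import List, Set, Tuple

Mask = Set[Tuple[int, int]]   # pixels (row, col) covered by an instance in one frame
Sequence = List[Mask]         # one mask per frame, covering the whole video


def sequence_iou(a: Sequence, b: Sequence) -> float:
    """IoU accumulated over all frames (an assumed similarity measure)."""
    inter = sum(len(ma & mb) for ma, mb in zip(a, b))
    union = sum(len(ma | mb) for ma, mb in zip(a, b))
    return inter / union if union else 0.0


def reduce_sequences(
    sequences: List[Sequence],
    scores: List[float],
    iou_threshold: float = 0.5,  # assumed value, not from the paper
) -> List[int]:
    """Greedy sequence-level NMS: return indices of the sequences to keep."""
    # Visit proposals from highest to lowest score.
    order = sorted(range(len(sequences)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        # Keep a proposal only if it does not overlap strongly with
        # any higher-scoring sequence that was already kept.
        if all(sequence_iou(sequences[i], sequences[k]) < iou_threshold for k in kept):
            kept.append(i)
    return kept
```

Under these assumptions, two proposed sequences that cover nearly the same pixels across the video are treated as the same instance, and only the higher-scoring one survives, while a sequence covering a different region is kept.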

Authors

Huaijia Lin, Ruizheng Wu, Shu Liu, Jiangbo Lu, Jiaya Jia

Topics

Deep Learning > Architectures > Neural Networks
Computer Vision > Analysis > Object Tracking
Computer Vision > Analysis > Semantic Segmentation
Computer Vision > Processing > Video Understanding
Computer Vision > Analysis > Video Understanding
Computer Vision > Analysis > Object Segmentation

Keywords

semantic segmentation, video instance segmentation, object tracking, sequence propagation
