Unsupervised Hierarchical Semantic Segmentation With Multiview Cosegmentation and Clustering Transformers

Tsung-Wei Ke; jyh-Jing Hwang; Yunhui Guo; Xudong Wang; Stella X. Yu

2022 CVPR CVPR 2022

Unsupervised Hierarchical Semantic Segmentation With Multiview Cosegmentation and Clustering Transformers

Abstract

Unsupervised semantic segmentation aims to discover groupings within and across images that capture object- and view-invariance of a category without external supervision. Grouping naturally has levels of granularity, creating ambiguity in unsupervised segmentation. Existing methods avoid this ambiguity and treat it as a factor outside modeling, whereas we embrace it and desire hierarchical grouping consistency for unsupervised segmentation. We approach unsupervised segmentation as a pixel-wise feature learning problem. Our idea is that a good representation must be able to reveal not just a particular level of grouping, but any level of grouping in a consistent and predictable manner across different levels of granularity. We enforce spatial consistency of grouping and bootstrap feature learning with co-segmentation among multiple views of the same image, and enforce semantic consistency across the grouping hierarchy with clustering transformers. We deliver the first data-driven unsupervised hierarchical semantic segmentation method called Hierarchical Segment Grouping (HSG). Capturing visual similarity and statistical co-occurrences, HSG also outperforms existing unsupervised segmentation methods by a large margin on five major object- and scene-centric benchmarks.

🌉 Interdisciplinary Bridge — Computer Vision and Deep Learning and Machine Learning

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Tsung-Wei Ke , jyh-Jing Hwang , Yunhui Guo , Xudong Wang , Stella X. Yu

Topics

Machine Learning > Core Methods > Clustering Machine Learning > Learning Types > Unsupervised Learning Computer Vision > Processing > Semantic Segmentation Deep Learning > Models > Transformers

Keywords

unsupervised learning transformer architecture semantic segmentation feature learning hierarchical clustering multi-view learning hierarchical grouping

Download PDF

Related papers

UniCoRN: A Unified Conditional Image Repainting Network 2022

Why Discard if You Can Recycle?: A Recycling Max Pooling Module for 3D Point Cloud Analysis 2022

All-in-One Image Restoration for Unknown Corruption 2022

Stability-Driven Contact Reconstruction From Monocular Color Images 2022

Forecasting Characteristic 3D Poses of Human Actions 2022