Going Beyond Multi-Task Dense Prediction with Synergy Embedding Models

Huimin Huang; Yawen Huang; Lanfen Lin; Ruofeng Tong; Yen-Wei Chen; Hao Zheng; Yuexiang Li; Yefeng Zheng

2024 CVPR CVPR 2024

Going Beyond Multi-Task Dense Prediction with Synergy Embedding Models

Abstract

Multi-task visual scene understanding aims to leverage the relationships among a set of correlated tasks which are solved simultaneously by embedding them within a uni- fied network. However most existing methods give rise to two primary concerns from a task-level perspective: (1) the lack of task-independent correspondences for distinct tasks and (2) the neglect of explicit task-consensual dependencies among various tasks. To address these issues we propose a novel synergy embedding models (SEM) which goes be- yond multi-task dense prediction by leveraging two innova- tive designs: the intra-task hierarchy-adaptive module and the inter-task EM-interactive module. Specifically the con- structed intra-task module incorporates hierarchy-adaptive keys from multiple stages enabling the efficient learning of specialized visual patterns with an optimal trade-off. In ad- dition the developed inter-task module learns interactions from a compact set of mutual bases among various tasks benefiting from the expectation maximization (EM) algo- rithm. Extensive empirical evidence from two public bench- marks NYUD-v2 and PASCAL-Context demonstrates that SEM consistently outperforms state-of-the-art approaches across a range of metrics.

🌉 Interdisciplinary Bridge — Computer Vision and Machine Learning

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Huimin Huang , Yawen Huang , Lanfen Lin , Ruofeng Tong , Yen-Wei Chen , Hao Zheng , Yuexiang Li , Yefeng Zheng

Topics

Machine Learning > Optimization & Theory > Optimization Computer Vision > Analysis > Scene Understanding

Keywords

multi-task learning scene understanding dense prediction neural network

Download PDF

Related papers

DUSt3R: Geometric 3D Vision Made Easy 2024

Bezier Everywhere All at Once: Learning Drivable Lanes as Bezier Graphs 2024

NeRFDeformer: NeRF Transformation from a Single View via 3D Scene Flows 2024

Unleashing Unlabeled Data: A Paradigm for Cross-View Geo-Localization 2024

DIMAT: Decentralized Iterative Merging-And-Training for Deep Learning Models 2024