CVPR 2024

Going Beyond Multi-Task Dense Prediction with Synergy Embedding Models

Abstract

Multi-task visual scene understanding aims to leverage the relationships among a set of correlated tasks, which are solved simultaneously by embedding them within a unified network. However, most existing methods give rise to two primary concerns from a task-level perspective: (1) the lack of task-independent correspondences for distinct tasks, and (2) the neglect of explicit task-consensual dependencies among various tasks. To address these issues, we propose a novel synergy embedding model (SEM), which goes beyond multi-task dense prediction by leveraging two innovative designs: the intra-task hierarchy-adaptive module and the inter-task EM-interactive module. Specifically, the constructed intra-task module incorporates hierarchy-adaptive keys from multiple stages, enabling the efficient learning of specialized visual patterns with an optimal trade-off. In addition, the developed inter-task module learns interactions from a compact set of mutual bases among various tasks, benefiting from the expectation maximization (EM) algorithm. Extensive empirical evidence from two public benchmarks, NYUD-v2 and PASCAL-Context, demonstrates that SEM consistently outperforms state-of-the-art approaches across a range of metrics.
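
To make the idea of a compact set of mutual bases concrete, the following is a minimal sketch of a generic EM-style iteration over features stacked from several tasks. It is not the paper's implementation: the function name `em_mutual_bases`, the softmax E-step, the weighted-mean M-step, and all parameters (`num_bases`, `num_iters`, `seed`) are assumptions made for illustration, since the abstract does not specify how the inter-task module is built.

```python
import numpy as np


def em_mutual_bases(features, num_bases=8, num_iters=3, seed=0):
    """Hypothetical EM-style estimation of shared bases (illustrative only).

    features: (N, C) array, e.g. flattened features from several tasks
              stacked along the first axis.
    Returns (bases, resp, recon) with shapes (K, C), (N, K), (N, C).
    """
    rng = np.random.default_rng(seed)
    _, channels = features.shape

    # Random unit-norm initial bases (an assumption, not from the paper).
    bases = rng.standard_normal((num_bases, channels))
    bases /= np.linalg.norm(bases, axis=1, keepdims=True)

    for _ in range(num_iters):
        # E-step: soft assignment of every feature to every basis.
        logits = features @ bases.T                    # (N, K)
        logits -= logits.max(axis=1, keepdims=True)    # numerical stability
        resp = np.exp(logits)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: each basis becomes the responsibility-weighted mean
        # of the features, then is re-normalized to unit length.
        weights = resp / (resp.sum(axis=0, keepdims=True) + 1e-6)
        bases = weights.T @ features                   # (K, C)
        bases /= np.linalg.norm(bases, axis=1, keepdims=True) + 1e-6

    # Re-express every feature as a mixture of the shared bases.
    recon = resp @ bases                               # (N, C)
    return bases, resp, recon
```

Under these assumptions, features from two tasks (say, depth and segmentation maps flattened to shape `(N, C)`) would be concatenated along the first axis before the call, so that both tasks are described by the same small set of `num_bases` vectors. The reconstruction can then be split back per task.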
