2026 AAAI AAAI 2026

Interest-driven Deep Multi-modal Clustering

Abstract

Abstract Deep multi-modal clustering fully learns semantically consistent and discriminative cluster representations between multiple modalities in an unlabeled manner. However, existing methods treat all samples equally, ignoring varying sample quality, which limits clustering performance. Inspired by the concept of interest in the recommendation system, we propose a novel interest-driven deep multi-modal clustering (IDMC) framework. It designs a new paradigm to quantify the importance of each sample base on the attention it receives from other samples, which called interest value. This value jointly captures the local geometric structure through the Euclidean distance in feature space and the consistency of pseudo-labels. Then, we design a novel adaptive Bayesian fusion mechanism to dynamically balance the prior features and self-supervisory signals to ensure confidence-based sample importance estimation. Furthermore, we introduce a median normalization constraint and a label consistency constraint to further refine the construction of the interest value. By embedding this interest-guided value into representation learning and cluster optimization, IDMC focuses on the samples with the most information and the most stable semantics, thereby enhancing the performance of multi-modal representation learning. Extensive experiments verify that IDMC is superior to existing state-of-the-art methods in multiple evaluation metrics.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Machine Learning
🧭 Keyword Pioneer — interest value
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio