2025 ICCV ICCV 2025

MPBR: Multimodal Progressive Bidirectional Reasoning for Open-Set Fine-Grained Recognition

Abstract

Open-set fine-grained recognition (OSFGR) is the core exploration of building open-world intelligent systems. The challenge lies in the gradual semantic drift during the transition from coarse-grained to fine-grained categories. However, although existing methods leverage hierarchical representations to assist progressive reasoning, they neglect semantic consistency across hierarchies. To address this, we propose a multimodal progressive bidirectional reasoning framework: (1) In forward reasoning, the model progressively refines visual features to capture hierarchical structural representations, while (2) in backward reasoning, variational inference integrates multimodal information to constrain consistency in category-aware latent spaces. This mechanism mitigates semantic drift through bidirectional information flow and cross-hierarchical feature consistency constraints. Extensive experiments on the iNat2021-OSR dataset demonstrate that our proposed method achieves superior performance.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Computer Vision and Deep Learning and Machine Learning
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio