IJCAI 2025

Image-Enhanced Hybrid Encoding with Reinforced Contrastive Learning for Spatial Domain Identification in Spatial Transcriptomics

Abstract

Spatial transcriptomics integrates spatial, gene expression, and multichannel immunohistochemistry image data, enabling deeper insight into cellular organization. However, existing methods often struggle to fuse these multimodal data effectively, which limits the accuracy of spatial domain identification. Here, we propose IE-HERCL (Image-Enhanced Hybrid Encoding with Reinforced Contrastive Learning), a novel framework designed to address this challenge. Specifically, IE-HERCL applies hybrid encoding to both the gene and image modalities: autoencoders capture the non-spatial features, and GraphSAGE captures the spatial dependencies. These features are then fused with cross-view attention mechanisms to generate a unified, informative embedding. To strengthen representation learning, we introduce a reinforced contrastive learning strategy that mitigates the influence of false negative samples by detecting potential positive counterparts with high-order random walks. In addition, the cluster alignment is dynamically refined through optimal transport, which keeps the fused consensus representation coherent and robust and enables accurate spatial domain identification. Our approach achieves state-of-the-art performance on five image-enhanced spatial transcriptomics datasets, demonstrating its robustness and effectiveness in multimodal integration and spatial domain identification. The code is available at https://github.com/wdyi701/IE-HERCL.
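To make the false-negative idea concrete, the following is a minimal NumPy sketch of how high-order random walks could flag likely positive counterparts and how those could be excluded from the negatives of an InfoNCE-style loss. It is an illustration under stated assumptions, not the released IE-HERCL implementation: the function names `random_walk_positives` and `masked_info_nce`, the parameters `order`, `top_k`, and `temperature`, and the choice to sum walk probabilities over steps 1 to `order` are all hypothetical.

```python
import numpy as np


def random_walk_positives(adj, order=3, top_k=2):
    """Flag likely positives per node from accumulated k-step walk probabilities.

    Assumption: `adj` is a dense, symmetric spot-adjacency matrix.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    # Row-normalise the adjacency matrix into a transition matrix.
    deg = adj.sum(axis=1, keepdims=True)
    trans = np.divide(adj, deg, out=np.zeros_like(adj), where=deg > 0)
    # Accumulate visiting probabilities over walks of length 1..order.
    reach = np.zeros_like(trans)
    step = np.eye(n)
    for _ in range(order):
        step = step @ trans
        reach += step
    np.fill_diagonal(reach, 0.0)  # a node is not its own counterpart
    # Keep the top_k most reachable nodes in each row as candidates.
    idx = np.argsort(-reach, axis=1)[:, :top_k]
    mask = np.zeros((n, n), dtype=bool)
    mask[np.repeat(np.arange(n), idx.shape[1]), idx.ravel()] = True
    mask &= reach > 0  # never flag nodes the walk cannot reach
    return mask


def masked_info_nce(z1, z2, pos_mask, temperature=0.5):
    """InfoNCE-style loss in which flagged pairs count as positives.

    Assumption: z1 and z2 are (n, d) embeddings of the same n spots
    from two views; row i of each refers to the same spot.
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = np.exp(z1 @ z2.T / temperature)
    # Positives: the same spot in the other view plus flagged neighbours.
    pos = pos_mask | np.eye(len(z1), dtype=bool)
    numerator = (sim * pos).sum(axis=1)
    denominator = sim.sum(axis=1)
    return float(-np.log(numerator / denominator).mean())
```

Calling `random_walk_positives` on a spot graph and passing the resulting mask to `masked_info_nce` moves the flagged pairs from the negative set into the numerator, so spots that the walk suggests belong together are no longer pushed apart. A real pipeline would likely use sparse matrices and a differentiable framework instead of dense NumPy arrays.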
