CVPR 2025

Multi-modal Contrastive Learning with Negative Sampling Calibration for Phenotypic Drug Discovery

Abstract

Phenotypic drug discovery presents a promising strategy for identifying first-in-class drugs by bypassing the need for specific drug targets. Recent advances in cell-based phenotypic screening tools, including Cell Painting and LINCS L1000, provide essential cellular data that capture biological responses to compounds. While integrating these multi-modal data enhances the use of contrastive learning (CL) methods for molecular phenotypic representation, existing approaches treat all negative pairs equally and thus fail to distinguish molecules with similar phenotypes. To address this limitation, we introduce a foundational framework, MINER, that dynamically estimates the likelihood that a sample pair is a negative pair based on uni-modal disentangled representations. In addition, our approach incorporates a mixture fusion strategy to effectively integrate multi-modal data, even when certain modalities are missing. Extensive experiments demonstrate that our method improves both molecular property prediction and molecule-phenotype retrieval accuracy. Moreover, it successfully recommends, from phenotypes, drug candidates for complex diseases that are documented in the literature. These findings underscore MINER's potential to advance drug discovery by enabling deeper insights into disease mechanisms and improving drug candidate recommendations.
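To make the idea of calibrating negative pairs concrete, the following is a minimal sketch of a contrastive (InfoNCE-style) loss in which each negative pair is scaled by an estimated likelihood of being a true negative. It is an illustration under stated assumptions, not MINER's actual formulation: the function names `negative_likelihood` and `weighted_info_nce`, the cosine-based weighting rule, and the temperature value are all hypothetical and do not come from the abstract.

```python
import numpy as np


def negative_likelihood(unimodal):
    """Hypothetical weighting rule (an assumption, not the paper's estimator).

    unimodal: (n, d) array of uni-modal embeddings, one row per sample.
    Returns an (n, n) array in [0, 1]; pairs whose embeddings are similar
    receive a low likelihood of being true negatives.
    """
    z = unimodal / np.linalg.norm(unimodal, axis=1, keepdims=True)
    sim = z @ z.T  # cosine similarity in [-1, 1]
    return 1.0 - np.clip(sim, 0.0, 1.0)


def weighted_info_nce(mol, pheno, neg_weight, temperature=0.1):
    """InfoNCE over molecule-to-phenotype pairs with per-pair negative weights.

    mol, pheno: (n, d) arrays; row i of each forms a positive pair.
    neg_weight: (n, n) array in [0, 1]; its diagonal is ignored.
    """
    mol = mol / np.linalg.norm(mol, axis=1, keepdims=True)
    pheno = pheno / np.linalg.norm(pheno, axis=1, keepdims=True)
    logits = mol @ pheno.T / temperature
    # Subtract the row-wise maximum for numerical stability; the ratio
    # pos / denom below is unchanged by this shift.
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    w = neg_weight.copy()
    np.fill_diagonal(w, 1.0)  # positive pairs always count fully
    pos = np.diag(exp)
    denom = (w * exp).sum(axis=1)
    return float(np.mean(-np.log(pos / denom)))
```

With all weights set to 1 this reduces to the standard InfoNCE loss, in which every other sample in the batch is treated as an equally valid negative. Lowering the weight of a pair shrinks its contribution to the denominator, so molecules with similar phenotypes are pushed apart less strongly.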
