Foundation X: Integrating Classification, Localization, and Segmentation through Lock-Release Pretraining Strategy for Chest X-ray Analysis

Nahid Ul Islam; DongAo Ma; Jiaxuan Pang; Shivasakthi Senthil Velan; Michael Gotway; Jianming Liang

WACV 2025

Abstract

Developing robust and versatile deep-learning models is essential for enhancing diagnostic accuracy and guiding clinical interventions in medical imaging, but it requires a large amount of annotated data. The advancement of deep learning has facilitated the creation of numerous medical datasets with diverse expert-level annotations. Aggregating these datasets can maximize data utilization and address the inadequacy of labeled data. However, the heterogeneity of expert-level annotations across tasks such as classification, localization, and segmentation presents a significant challenge for learning from these datasets. To this end, we introduce Foundation X, an end-to-end framework that utilizes diverse expert-level annotations from numerous public datasets to train a foundation model capable of multiple tasks including classification, localization, and segmentation. To address the challenges of annotation and task heterogeneity, we propose a Lock-Release pretraining strategy to enhance the cyclic learning from multiple datasets, combined with the student-teacher learning paradigm, ensuring the model retains general knowledge for all tasks while preventing overfitting to any single task. To demonstrate the effectiveness of Foundation X, we trained a model using 11 chest X-ray datasets, covering annotations for classification, localization, and segmentation tasks. Our experimental results show that Foundation X achieves notable performance gains through extensive annotation utilization, excels in cross-dataset and cross-task learning, and further enhances performance in organ localization and segmentation tasks. All code and pretrained models are publicly accessible at GitHub.com/JLiangLab/Foundation_X.
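
The abstract names the ingredients of the training scheme (cycling over datasets, a Lock-Release schedule, and a student-teacher pair) without giving their mechanics. As a rough illustration only, the toy sketch below assumes that "lock" means training a task head while the shared backbone is frozen, that "release" means training both, and that the teacher is an exponential moving average of the student. These are assumptions, not details confirmed by the abstract, and every name here (`ToyModel`, `train_step`, `ema_update`, `lock_release_cycle`) is hypothetical and unrelated to the released code.

```python
from dataclasses import dataclass, field


@dataclass
class ToyModel:
    """Scalar stand-in for a network: one shared backbone weight, one head per task."""
    backbone: float = 0.0
    heads: dict = field(
        default_factory=lambda: {"cls": 0.0, "loc": 0.0, "seg": 0.0}
    )


def train_step(student, task, target, lr, backbone_locked):
    """One gradient step on the squared error (backbone + head - target)**2."""
    error = student.backbone + student.heads[task] - target
    grad = 2.0 * error  # same gradient w.r.t. the backbone and the head
    student.heads[task] -= lr * grad
    if not backbone_locked:  # "release": the shared weight is updated too
        student.backbone -= lr * grad


def ema_update(teacher, student, momentum=0.9):
    """Move every teacher weight toward the student (assumed EMA teacher)."""
    teacher.backbone = momentum * teacher.backbone + (1 - momentum) * student.backbone
    for task in teacher.heads:
        teacher.heads[task] = (
            momentum * teacher.heads[task] + (1 - momentum) * student.heads[task]
        )


def lock_release_cycle(student, teacher, datasets, lr=0.1):
    """Visit each (task, target) pair once: lock step, release step, teacher update."""
    for task, target in datasets:
        train_step(student, task, target, lr, backbone_locked=True)
        train_step(student, task, target, lr, backbone_locked=False)
        ema_update(teacher, student)


if __name__ == "__main__":
    student, teacher = ToyModel(), ToyModel()
    # Hypothetical targets standing in for three heterogeneous datasets.
    datasets = [("cls", 1.0), ("loc", 2.0), ("seg", 3.0)]
    for _ in range(5):
        lock_release_cycle(student, teacher, datasets)
    print(student)
    print(teacher)
```

In this toy setting the locked step can only adapt the head of the current task, so the shared weight is left untouched until the release step, while the slowly moving teacher averages over all tasks visited so far.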

Authors

Nahid Ul Islam, DongAo Ma, Jiaxuan Pang, Shivasakthi Senthil Velan, Michael Gotway, Jianming Liang

Topics

Artificial Intelligence > Core AI > Foundation Models
Machine Learning > Application Areas > Domain Adaptation
Computer Vision > Domain-Specific > Medical Imaging
Deep Learning > Models > Foundation Models
Deep Learning > Techniques > Transfer Learning

Keywords

semantic segmentation, multi-task learning, transfer learning, medical imaging, knowledge distillation, foundation model, chest x-ray
