PM2F2N: Patient Multi-view Multi-modal Feature Fusion Networks for Clinical Outcome Prediction
Abstract
AbstractClinical outcome prediction is critical to the condition prediction of patients and management of hospital capacities. There are two kinds of medical data, including time series signals recorded by various devices and clinical notes in electronic health records (EHR), which are used for two common prediction targets: mortality and length of stay. Traditional methods focused on utilizing time series data but ignored clinical notes. With the development of deep learning, natural language processing (NLP) and multi-modal learning methods are exploited to jointly model the time series and clinical notes with different modals. However, the existing methods failed to fuse the multi-modal features of patients from different views. Therefore, we propose the patient multi-view multi-modal feature fusion networks for clinical outcome prediction. Firstly, from patient inner view, we propose to utilize the co-attention module to enhance the fine-grained feature interaction between time series and clinical notes from each patient. Secondly, the patient outer view is the correlation between patients, which can be reflected by the structural knowledge in clinical notes. We exploit the structural information extracted from clinical notes to construct the patient correlation graph, and fuse patients’ multi-modal features by graph neural networks (GNN). The experimental results on MIMIC-III benchmark demonstrate the superiority of our method.