Skip to yearly menu bar Skip to main content


Multimodal Patient Representation Learning with Missing Modalities and Labels

Zhenbang Wu · Anant Dadu · Nicholas Tustison · Brian Avants · Michael Nalls · Jimeng Sun · Faraz Faghri

Halle B #31
[ ]
Tue 7 May 1:45 a.m. PDT — 3:45 a.m. PDT


Multimodal patient representation learning aims to integrate information from multiple modalities and generate comprehensive patient representations for subsequent clinical predictive tasks. However, many existing approaches either presuppose the availability of all modalities and labels for each patient or only deal with missing modalities. In reality, patient data often comes with both missing modalities and labels for various reasons (i.e., the missing modality and label issue). Moreover, multimodal models might over-rely on certain modalities, causing sub-optimal performance when these modalities are absent (i.e., the modality collapse issue). To address these issues, we introduce MUSE: a mutual-consistent graph contrastive learning method. MUSE uses a flexible bipartite graph to represent the patient-modality relationship, which can adapt to various missing modality patterns. To tackle the modality collapse issue, MUSE learns to focus on modality-general and label-decisive features via a mutual-consistent contrastive learning loss. Notably, the unsupervised component of the contrastive objective only requires self-supervision signals, thereby broadening the training scope to incorporate patients with missing labels. We evaluate MUSE on three publicly available datasets: MIMIC-IV, eICU, and ADNI. Results show that MUSE outperforms all baselines, and MUSE+ further elevates the absolute improvement to ~4% by extending the training scope to patients with absent labels.

Chat is not available.