Poster in Workshop: Machine Learning for Genomics Explorations (MLGenX)
Knockoff Statistics-Driven Interpretable Deep Learning Models for Uncovering Potential Biomarkers for COVID-19 Severity Prediction
Qian Liu · Daryl Xing · Huanjing Liu · Pingzhao Hu
COVID-19 affects individuals differently, with some experiencing severe symptoms while others remain asymptomatic. Identifying the genetic determinants behind this variability can improve disease management, resource allocation, and public health decisions. Traditional approaches such as genome-wide association studies and polygenic risk scores offer limited interpretability and predictive accuracy. In this study, we developed a computational framework that combines a deep generative model with explainable AI (xAI) to predict COVID-19 severity from whole-genome data. Our framework identified 72 significant genetic markers and achieved improved prediction performance (ROC-AUC = 0.64) using whole-genome data from 6752 samples in Canada’s CGEn HostSeq project. Among these markers, 50 are novel and are linked to hematopoietic stem cell differentiation, lung fibrosis, and SARS-CoV-2 mitochondrial interactions. This study introduces an interpretable AI tool for personalized COVID-19 severity prediction.