Poster in Workshop: Machine Learning for Genomics Explorations (MLGenX)
scvi-hub: A flexible framework for reference enabled single-cell data analysis
Can Ergen
The accumulation of single-cell omics datasets in the public domain has opened new opportunities to reuse and leverage the vast amount of information they contain. Such uses, however, are complicated by the need for complex and resource-consuming procedures for data transfer, normalization, and integration that must be addressed prior to any analysis. Here we present scvi-hub: a platform for evaluating, sharing, and accessing probabilistic models that were trained on single-cell omics datasets. We demonstrate that these pre-trained models allow immediate access to a slew of fundamental tasks like visualization, imputation, annotation, outlier detection, and deconvolution of new (query) datasets with a much lower requirement for compute resources. We also show that pre-trained models can help drive new discoveries with the existing (reference) datasets through rapid, model-based analyses. Scvi-hub is built within scvi-tools and integrated into scverse. Scvi-hub is publicly available to enable efficient sharing of single-cell omics studies, and also to put advanced capabilities for transfer learning at the fingertips of a broad community of users. We provide an extended journal version on bioRxiv.