On the Robustness of scRNA-seq Foundation Models for Plants Under Cross-Domain Experimental Shift
Abstract
Foundation models for single-cell transcriptomics promise to learn generalizable representations of cellular states, yet their robustness to cross-study distribution shift remains underexplored in plant systems. We introduce scAraFM, an Arabidopsis-specific foundation model, and evaluate its utility for stress prediction across leaf and root scRNA-seq datasets under three increasingly challenging protocols: random splits within a single experiment, replicate-based splits within a single experiment, and cross-experiment transfer learning. We find that random splits within a single experiment can overestimate performance by 20 to 30 AUROC points relative to both replicate-held-out evaluation and evaluation across independent experiments, underscoring the need for study-aware validation in fragmented transcriptomic landscapes. Across representation strategies, gene-identity-preserving features consistently outperform pooled summaries, even when the latter are derived from pretrained transformers. Notably, simple baselines using raw reads remain competitive with or superior to learned embeddings in single-experiment scenarios, challenging claims that foundation-model-derived features are universally advantageous. Our promising results on cross-experiment transfer learning show that evaluation design is as critical as model architecture, and that preserving per-gene structure aids generalization in downstream tasks.
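The three evaluation protocols differ only in which cells are allowed to share a source between the training and test sets. The following is a minimal sketch of that difference in plain Python, not the pipeline used in this work: the `Cell` record, the experiment and replicate labels, the toy data, and all function names are assumptions made for illustration.

```python
import random
from collections import namedtuple

# Hypothetical cell record; field names are illustrative, not from the paper.
Cell = namedtuple("Cell", ["cell_id", "experiment", "replicate"])

def random_split(cells, test_frac=0.25, seed=0):
    """Protocol 1: shuffle cells and split, ignoring replicate structure."""
    rng = random.Random(seed)
    shuffled = list(cells)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_frac))
    return shuffled[n_test:], shuffled[:n_test]

def replicate_held_out_split(cells, held_out_replicate):
    """Protocol 2: hold out every cell from one replicate of an experiment."""
    train = [c for c in cells if c.replicate != held_out_replicate]
    test = [c for c in cells if c.replicate == held_out_replicate]
    return train, test

def cross_experiment_split(cells, test_experiment):
    """Protocol 3: train on other experiments, test on an unseen one."""
    train = [c for c in cells if c.experiment != test_experiment]
    test = [c for c in cells if c.experiment == test_experiment]
    return train, test

def shared_groups(train, test, key):
    """Return the group labels (e.g. replicates) present on both sides."""
    return {getattr(c, key) for c in train} & {getattr(c, key) for c in test}

# Toy data (assumed): 2 experiments x 3 replicates x 20 cells each.
cells = [
    Cell(f"{exp}_{rep}_{i}", exp, rep)
    for exp in ("expA", "expB")
    for rep in ("rep1", "rep2", "rep3")
    for i in range(20)
]
exp_a = [c for c in cells if c.experiment == "expA"]

rand_train, rand_test = random_split(exp_a)
rep_train, rep_test = replicate_held_out_split(exp_a, "rep3")
xexp_train, xexp_test = cross_experiment_split(cells, "expB")

# Replicate-held-out and cross-experiment splits share no source group.
print(shared_groups(rep_train, rep_test, "replicate"))
print(shared_groups(xexp_train, xexp_test, "experiment"))
```

Under a random split, cells from the same replicate typically land on both sides, so replicate-specific signal can leak into the test set. The other two protocols rule out such overlap by construction.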