Decoding Dynamic Visual Experience from Calcium Imaging via Cell-Pattern-Aware Pretraining
Abstract
Neural recordings exhibit a distinctive form of heterogeneity, rooted in differences in cell types, intrinsic circuit dynamics, and stochastic stimulus–response variability, that goes beyond ordinary dataset variability: statistically regular neurons are mixed with highly stochastic, stimulus-contingent ones within the same dataset. This heterogeneity undermines self-supervised learning (SSL), which depends on learnable statistical regularity, destabilizing representation learning and limiting reliable scaling. We introduce POYO-CAP (Cell-pattern Aware Pretraining), a biologically grounded hybrid pretraining strategy that first trains with masked reconstruction plus lightweight auxiliary supervision on statistically regular neurons, identified via skewness and kurtosis, and then fine-tunes on more stochastic populations. On the Allen Brain Observatory dataset, this curriculum yields 12–13% relative improvements over from-scratch training and enables smooth, monotonic scaling with model size, whereas baselines trained on mixed populations plateau or destabilize. By making statistical predictability an explicit data-selection criterion, POYO-CAP turns neural heterogeneity into a scalable learning advantage for robust neural decoding.
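To make the data-selection criterion concrete, the abstract's skewness/kurtosis screen could be sketched as follows. This is an illustrative sketch only: the function name `select_regular_neurons` and the numeric thresholds are hypothetical and not taken from the paper.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def select_regular_neurons(traces, skew_max=2.0, kurt_max=10.0):
    """Flag neurons whose activity statistics look 'regular'.

    traces: array of shape (n_neurons, n_timepoints), e.g. dF/F traces.
    skew_max, kurt_max: illustrative thresholds (hypothetical); highly
    skewed or heavy-tailed traces are treated as stochastic and deferred
    to the fine-tuning stage.
    """
    s = skew(traces, axis=1)
    k = kurtosis(traces, axis=1)  # Fisher (excess) kurtosis
    return (np.abs(s) < skew_max) & (k < kurt_max)

# Toy example: one near-Gaussian trace vs. one sparse, spiky trace.
rng = np.random.default_rng(0)
regular = rng.normal(size=10_000)
spiky = np.where(rng.random(10_000) < 0.01, 10.0, 0.0) \
        + rng.normal(scale=0.1, size=10_000)
mask = select_regular_neurons(np.stack([regular, spiky]))
```

Under these assumed thresholds, the near-Gaussian trace passes the screen while the sparse, heavy-tailed trace is routed to the stochastic pool.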