Cross-subject decoding of human neural data for speech Brain-Computer Interfaces
Abstract
Brain-to-text systems have recently achieved impressive performance when trained on single-participant data, but they remain limited by the lack of investigation into cross-subject generalization. We present the first neural-to-phoneme decoder trained jointly on the two largest intracortical speech datasets (Willett et al. 2023; Card et al. 2024), introducing day- and dataset-specific affine transforms to align neural activity into a shared space. A hierarchical GRU decoder with intermediate CTC supervision and feedback connections further mitigates the conditional-independence assumption of standard CTC loss. Our model matches or outperforms within-subject baselines while being trained across participants, and adapts to unseen subjects using only a linear transform or brief fine-tuning. On an independent inner-speech dataset (Kunz et al. 2025), our approach also generalizes when only the subject- and day-specific transforms are trained. These results highlight cross-subject pretraining as a practical path toward scalable and clinically deployable speech BCIs.
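
As a rough illustration of the day- and dataset-specific affine alignment mentioned above, the following NumPy sketch keeps one weight matrix and bias per (dataset, day) pair and applies them to a block of neural features before any shared decoder. The class name `DayAffineAligner`, the identity initialization, and the `(time, channels)` input layout are assumptions made for illustration; the abstract does not specify how the transforms are parameterized or initialized, and in practice they would be learned rather than set by hand.

```python
import numpy as np


class DayAffineAligner:
    """Hypothetical per-session affine alignment: x -> x @ W + b.

    One (W, b) pair is stored per (dataset, day) key. This is an
    illustrative sketch, not the authors' implementation.
    """

    def __init__(self, n_channels):
        self.n_channels = n_channels
        self.params = {}  # maps (dataset, day) -> (W, b)

    def _get(self, key):
        # Assumption: a new session starts from the identity map,
        # so an unseen day initially passes features through unchanged.
        if key not in self.params:
            W = np.eye(self.n_channels)
            b = np.zeros(self.n_channels)
            self.params[key] = (W, b)
        return self.params[key]

    def __call__(self, x, dataset, day):
        # x has shape (time_steps, n_channels).
        W, b = self._get((dataset, day))
        return x @ W + b
```

With this layout, adapting to a new subject or recording day amounts to adding and fitting one extra `(W, b)` pair while the shared decoder stays fixed, which is the kind of lightweight adaptation the abstract describes.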