ChatSpatial: Schema-Enforced Agentic Orchestration for Reproducible Spatial Transcriptomics Analysis
Abstract
Spatial transcriptomics has transformed our ability to study tissue architecture at molecular resolution. Yet analyzing these data demands navigating 60+ computational methods across incompatible Python and R ecosystems, creating a widening gap between experimental capability and analytical accessibility. LLM-based agents offer a path forward through natural language interaction, but code-generating approaches introduce hallucinated functions, incorrect parameters, and non-reproducible outputs, undermining the reliability that scientific workflows require. We introduce schema-enforced agentic orchestration, a paradigm in which the LLM selects from pre-validated tool schemas rather than generating free-form code. Built on the Model Context Protocol (MCP), our approach embeds domain expertise directly into schema descriptions for context-aware parameter inference and bridges Python and R through automated data conversion. We instantiate this paradigm in ChatSpatial, a platform orchestrating 60+ methods across 15 analytical categories for spatial transcriptomics. Through replication of two published studies (recovering key biological findings including tumor microenvironment organization and subclonal heterogeneity) and systematic validation across 28 test scenarios on four platforms, we demonstrate that schema-enforced orchestration enables reproducible, multi-step discovery through conversation, supporting iterative loops from reasoning to computational experimentation.
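
To make the core idea concrete, the following is a minimal sketch of schema-enforced tool selection, written under stated assumptions and not taken from the ChatSpatial implementation. The tool name `find_spatial_domains`, its parameters, the allowed values, and the helper `validate_call` are all hypothetical, and the sketch uses only the Python standard library rather than an actual MCP server. It illustrates how a request proposed by an LLM could be checked against a pre-validated schema, so that an unknown tool or an invalid parameter is rejected before any analysis code runs.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ParamSpec:
    """One parameter in a tool schema (illustrative, not ChatSpatial's format)."""
    type_: type
    allowed: tuple = ()       # empty tuple: any value of type_ is accepted
    default: Any = None
    description: str = ""     # where domain guidance for the LLM could live


# Hypothetical registry of pre-validated tools; names and values are assumptions.
TOOL_SCHEMAS = {
    "find_spatial_domains": {
        "method": ParamSpec(
            str, ("leiden", "louvain"), "leiden",
            "Clustering method used to group spots into domains.",
        ),
        "n_domains": ParamSpec(
            int, (), 7,
            "Expected number of domains; depends on tissue and platform.",
        ),
    },
}


def validate_call(tool: str, args: dict) -> dict:
    """Check an LLM-proposed call against its schema and fill in defaults.

    Raises ValueError or TypeError instead of executing anything when the
    tool or a parameter falls outside the schema.
    """
    if tool not in TOOL_SCHEMAS:
        raise ValueError(f"unknown tool: {tool!r}")
    schema = TOOL_SCHEMAS[tool]

    unknown = set(args) - set(schema)
    if unknown:
        raise ValueError(f"unknown parameters for {tool!r}: {sorted(unknown)}")

    resolved = {}
    for name, spec in schema.items():
        value = args.get(name, spec.default)
        if not isinstance(value, spec.type_):
            raise TypeError(f"{name!r} must be of type {spec.type_.__name__}")
        if spec.allowed and value not in spec.allowed:
            raise ValueError(f"{name!r} must be one of {spec.allowed}")
        resolved[name] = value
    return resolved


if __name__ == "__main__":
    # A well-formed request: the missing parameter is filled from the schema.
    print(validate_call("find_spatial_domains", {"method": "louvain"}))

    # A hallucinated method name is rejected before any computation starts.
    try:
        validate_call("find_spatial_domains", {"method": "kmeans_typo"})
    except ValueError as err:
        print("rejected:", err)
```

In this sketch the only actions available to the model are the entries in the registry, which is the property the paradigm relies on: a fixed, validated set of calls with explicit defaults can be logged and replayed, whereas free-form generated code cannot be checked in the same way.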