Autoformalizing Biomedical Text into Verified Knowledge Graph Reasoning: A Neuro-Symbolic Architecture for Alzheimer's Disease
Abstract
Alzheimer’s disease (AD) research generates vast amounts of unstructured biomedical text—clinical protocols, biomarker studies, and mechanistic hypotheses—that remain disconnected from formal computational reasoning. We introduce a neuro-symbolic architecture that autoformalizes biomedical text into a typed, verifiable knowledge graph (AD-KG), enabling auditable reasoning over AD biomarkers, patient stratification, and trial-protocol verification. Large language models serve as proposers that translate natural-language descriptions into typed logical predicates, while Answer Set Programming (ASP) solvers and temporal-logic checkers serve as verifiers that enforce machine-checkable consistency. We evaluate three core capabilities on newly constructed benchmarks: (1) Autoformalization of biomarker sentences, where retrieval-augmented formalization raises Entity F1 from 0.362 to 0.414 over an LLM-only baseline; (2) Clinical reasoning on multi-hop questions, where Chain-of-Thought achieves the highest accuracy (86.7%) while Neuro-Symbolic CoT (NS-CoT) reduces inconsistency rate from 81.1% to 45.6% at the cost of lower verification rate; and (3) Protocol verification on trial protocols, where all methods achieve 0.605 F1 with perfect recall, indicating that current symbolic verification does not yet differentiate from neural-only approaches at small scale. These results demonstrate that neuro-symbolic integration provides measurable benefits for inconsistency reduction in clinical reasoning, while highlighting areas where further development is needed for autoformalization and protocol verification.