Latent-Implicit Thinking with Proof-Carrying Neuro-Symbolic Outputs for Biomedical Discovery
Abstract
Recent work on latent reasoning, in which large language models (LLMs) perform intermediate computation in continuous representation spaces rather than generating explicit token chains, achieves large efficiency gains (80–90% token reduction) but sacrifices the transparency that makes chain-of-thought (CoT) reasoning auditable. We propose Latent-to-Symbolic Compilation (LaSy), a four-component pipeline that lets models reason efficiently in latent space while emitting proof-carrying structured outputs: causal graphs with typed edges, mechanistic constraints, and minimal verification warrants. The pipeline comprises a latent reasoner for continuous thought evolution, a symbolic extractor that decodes latent states into formal graph structures, a constraint verifier that checks the extracted structures against domain axioms, and a warrant emitter that produces sparse evidence certificates. We evaluate LaSy on 50 reasoning tasks across three scientific domains and show that it matches latent-only efficiency (45 tokens vs. 319 for explicit CoT, an 86% reduction) while achieving 93% constraint satisfaction, compared to 63% for unverified latent reasoning. In a case study on NAD+-centered Alzheimer’s disease reversal, LaSy discriminates among three competing mechanistic hypotheses and generates falsifiable experimental proposals. Faithfulness probing shows that latent states encode semantically meaningful structure (89.6% linear probe accuracy for causal direction), providing evidence that implicit reasoning is compressed rather than opaque.
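The four components named above can be pictured as a chain of functions. The following is a minimal sketch only, not the authors' implementation: the function names, the `Edge` type, the two edge kinds, the smoothstep update standing in for continuous thought evolution, the 0.5 score threshold, the two toy axioms, and the placeholder node and evidence identifiers are all assumptions made for illustration.

```python
from dataclasses import dataclass

ALLOWED_KINDS = {"activates", "inhibits"}  # assumed edge types, not from the paper


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    kind: str


def latent_reasoner(scores, steps=3):
    """Toy stand-in for latent thought evolution: push scores toward 0 or 1."""
    state = list(scores)
    for _ in range(steps):
        state = [s * s * (3 - 2 * s) for s in state]  # smoothstep on [0, 1]
    return state


def symbolic_extractor(state, candidates, threshold=0.5):
    """Decode the state into a graph: keep candidate edges scoring above threshold."""
    return [edge for edge, s in zip(candidates, state) if s > threshold]


def constraint_verifier(edges):
    """Check two toy axioms: known edge types, no contradictory edge pairs."""
    violations, seen = [], {}
    for e in edges:
        if e.kind not in ALLOWED_KINDS:
            violations.append(f"unknown edge type: {e.kind}")
        prior = seen.setdefault((e.source, e.target), e.kind)
        if prior != e.kind:
            violations.append(f"contradictory edges: {e.source} -> {e.target}")
    return violations


def warrant_emitter(edges, evidence):
    """Emit at most one evidence id per edge; None marks an unsupported edge."""
    return {e: (evidence.get(e) or [None])[0] for e in edges}


def lasy(scores, candidates, evidence):
    """Run the four stages in order; warrants are emitted only for verified graphs."""
    state = latent_reasoner(scores)
    edges = symbolic_extractor(state, candidates)
    violations = constraint_verifier(edges)
    warrants = warrant_emitter(edges, evidence) if not violations else {}
    return edges, violations, warrants


# Placeholder nodes "A", "B", "C" and evidence ids are purely illustrative.
candidates = [
    Edge("A", "B", "activates"),
    Edge("B", "C", "inhibits"),
    Edge("A", "B", "inhibits"),
]
evidence = {candidates[0]: ["ev-1", "ev-2"]}

edges, violations, warrants = lasy([0.8, 0.7, 0.3], candidates, evidence)
print(edges, violations, warrants)
```

In this sketch the verifier gates the warrant emitter, so a graph that breaks an axiom yields no certificates at all. That is one possible design choice; a real system might instead repair the graph or emit warrants only for the consistent subgraph.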