NANOZK: Layerwise Zero-Knowledge Proofs for Verifiable Large Language Model Inference
Abstract
When users query proprietary LLM APIs, they receive outputs with no cryptographic assurance that the claimed model was actually used. Service providers could substitute cheaper models, apply aggressive quantization, or return cached responses, and none of these substitutions is detectable by users paying premium prices for frontier capabilities. We present NANOZK, a zero-knowledge proof system that makes LLM inference verifiable: users can cryptographically confirm that outputs correspond to a specific model's computation. Our key insight is that transformer inference naturally decomposes into independent layer computations, enabling a layerwise proof framework in which each layer generates a constant-size proof regardless of model width. This decomposition sidesteps the scalability barrier facing monolithic approaches and enables parallel proving. We develop lookup table approximations for non-arithmetic operations (softmax, GELU, LayerNorm) that introduce zero measurable accuracy loss, and introduce Fisher information-guided verification for scenarios where proving all layers is impractical. On GPT-2 scale transformers, NANOZK generates proofs in 43 seconds with a 6.9 KB proof size and 23 ms verification time, achieving a 52× speedup over EZKL while maintaining formal soundness guarantees (epsilon < 10^-37). Lookup approximations preserve model perplexity exactly, enabling verification without quality compromise.
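To make the idea of a lookup table approximation concrete, the following is a minimal sketch for GELU, not NANOZK's actual construction. The fixed-point scale (`SCALE`), the covered input range (`X_MIN`, `X_MAX`), the clamping behaviour, and all function names are assumptions chosen for illustration. The sketch precomputes GELU on a fixed-point grid, so that evaluating the activation reduces to a single table read. In a proof system, a lookup argument would then show that each (input, output) pair appears in the table.

```python
import math

# Assumed parameters for illustration only (not taken from the paper).
SCALE = 256                 # fixed-point scale: 8 fractional bits
X_MIN, X_MAX = -8.0, 8.0    # input range covered by the table
Q_MIN, Q_MAX = int(X_MIN * SCALE), int(X_MAX * SCALE)


def gelu(x: float) -> float:
    """Exact GELU, defined via the Gaussian CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def build_gelu_table() -> dict:
    """Map every quantized input in range to its quantized GELU output."""
    return {q: round(gelu(q / SCALE) * SCALE) for q in range(Q_MIN, Q_MAX + 1)}


GELU_TABLE = build_gelu_table()


def gelu_lookup(x: float) -> float:
    """Quantize x, clamp it to the table range, and read the stored value.

    Clamping is a simplification: inputs outside [X_MIN, X_MAX] are not
    approximated well, so a real table would need to cover the model's
    actual activation range.
    """
    q = round(x * SCALE)
    q = max(Q_MIN, min(Q_MAX, q))
    return GELU_TABLE[q] / SCALE
```

With these assumed parameters the table holds 4097 entries, and the approximation error inside the covered range is bounded by the input and output rounding steps, a little over 1/256 in the worst case. Whether such an error is measurable at the model level depends on the chosen precision, which this sketch does not attempt to match to the paper.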