Poster Thu, Apr 23, 2026 • 11:15 AM – 1:45 PM PDT

Let's Explore Step by Step: Generating Provable Formal Statements with Deductive Exploration

Qi Liu · Kangjie Bao · Yue Yang · Xinhao Zheng · Renqiu Xia · Qinxiang Cao · Junchi Yan

Abstract

Mathematical problem synthesis shows promise in resolving data exhaustion, contamination, and leakage for AI training and evaluation. Despite enormous efforts, an **expressiveness-validity-complexity trilemma** remains an open question. Existing methods either lack whole-process verifiability, are constrained to a particular domain, or are bounded by external models. This paper breaks the trilemma by proposing the framework of **DExploration** _(**D**eductive **Exploration**)_, which formulates problem synthesis as a step-by-step exploration process instead of one-shot generation. Agents are equipped with three simple yet powerful atomic actions: _introducing_ variables/hypotheses, _deducing_ new facts, and _submitting_ derived facts. The entire exploration process is formally verified by Lean 4, which encompasses most mathematical domains up to the research level. Once a conclusion is submitted, the framework outputs a formal statement with guaranteed provability, reducing the need for external models. To bootstrap training data for DExploration, we propose **Exploratory Transformation** to distill exploration trajectories from existing large-scale theorem-proving data. It rewrites formal proofs into a deductive style, parses dependencies among variables, hypotheses, and proof steps, then reassembles them into exploration trajectories by a topological order. Experiments validate the effectiveness and efficiency of our methods, achieving an improved success rate ($40.70\\% \mapsto 54.52\\%$), reduced token cost ($52.9\text{K} \mapsto 8.8\text{K}, 83\\%\downarrow$), broader complexity and difficulty distributions, and Pareto optimality. In $2726$ valid generations, three state-of-the-art provers fail on $60$ (Pass@4) and $8$ (Pass@64). Code, data, and models will be available.