Lean4Physics: Comprehensive Reasoning Framework for College-level Physics in Lean4
Abstract
We present Lean4PHYS, a comprehensive reasoning framework for college-level physics problems in Lean4. To establish a solid foundation for formal reasoning in physics, Lean4PHYS launches PhysLib, a repository containing fundamental unit systems and essential theorems to formulate physics proofs in Lean4. It will be community-driven and long-term maintained. Lean4PHYS also includes LeanPhysBench, a college-level benchmark for evaluating LLMs' Lean4 formal physics reasoning capability. It contains 200 hand-crafted and peer-reviewed Lean4 theorem statements formalized from university textbooks and physics competition problems. Based on the PhysLib and LeanPhysBench we composed in Lean4PHYS, we perform exhaustive experiments of baseline results using major expert Math provers and state-of-the-art closed-source models, and provide an analysis of their performance. In the experiment, we identify that most expert provers do not outperform general models as they did in the math domain. This suggests potential overfitting to the math domain rather than learning formal reasoning for formal provers. We also conduct a comprehensive experiment showing that, with PhysLib in the context, LLMs' performance on LeanPhysBench increases by 11.90% on average, proving the effectiveness of our repository in assisting LLMs in solving the Lean4 physics problem. To the best of our knowledge, we are the first study to provide a physics benchmark in Lean4.