

Poster

miniCTX: Neural Theorem Proving with (Long-)Contexts

Jiewen Hu · Thomas Zhu · Sean Welleck

Hall 3 + Hall 2B #248
Thu 24 Apr midnight PDT — 2:30 a.m. PDT
 
Oral presentation: Oral Session 1A
Wed 23 Apr 7:30 p.m. PDT — 9 p.m. PDT

Abstract: Real-world formal theorem proving often depends on a wealth of context, including definitions, lemmas, comments, file structure, and other information. We introduce miniCTX, which tests a model's ability to prove formal mathematical theorems that depend on new context not seen during training. miniCTX contains theorems sourced from real Lean projects and textbooks, each associated with a context that can span tens of thousands of tokens. Models are tasked with proving a theorem given access to code from the theorem's repository, which contains the context needed for the proof. As baselines for miniCTX, we test fine-tuning and prompting methods that condition theorem proving on preceding context; both substantially outperform traditional methods that rely solely on state information. We find that this ability to use context is not captured by previous benchmarks such as miniF2F. Alongside miniCTX, we offer ntp-toolkit for automatically extracting and annotating theorem-proving data, making it easy to add new projects to miniCTX and ensure that contexts are not seen during training. miniCTX offers a challenging and realistic evaluation of neural theorem provers.
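To make the notion of "context" concrete, here is a minimal Lean 4 sketch (hypothetical names, not drawn from miniCTX itself) of a theorem whose proof depends on a definition and a lemma appearing earlier in the same file:

```lean
-- Preceding file context: a project-specific definition ...
def double (n : Nat) : Nat := n + n

-- ... and a helper lemma proved earlier in the same file.
theorem double_eq (n : Nat) : double n = n + n := rfl

-- Target theorem: a prover that sees only the goal state treats `double`
-- as an opaque constant, while one conditioned on the preceding file
-- context can unfold the definition or rewrite with `double_eq`.
theorem double_add (m n : Nat) : double (m + n) = double m + double n := by
  simp only [double_eq]
  omega
```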

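The context-conditioned prompting baseline can be sketched as follows; this is a minimal illustration under assumed prompt formats and function names, not the paper's exact setup:

```python
def build_context_prompt(file_context: str, theorem_statement: str) -> str:
    """Condition the prover on the code preceding the theorem in its file
    (definitions, lemmas, comments, imports) rather than on the goal alone."""
    return (
        "Complete the following Lean 4 proof.\n\n"
        f"{file_context}\n\n"
        f"{theorem_statement} := by\n"
    )


def build_state_prompt(goal_state: str) -> str:
    """State-only baseline: the model sees just the proof goal, so any
    project-local definitions appear as opaque constants."""
    return f"Prove the following Lean goal:\n\n{goal_state}\n"
```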