
Workshop: Deep Learning for Code

Generating Programming Puzzles to Train Language Models

Patrick Haluptzok · Matthew Bowers · Adam Tauman Kalai


This work shows how large-scale Language Models (LMs) can be used to automatically generate programming problems with verified solutions, in the form of “programming puzzles,” which can in turn be used to fine-tune other LMs to solve more difficult programming puzzles. This work builds on two recent developments. First, LMs have achieved breakthroughs in non-trivial reasoning and algorithm implementation, generating code that can solve some intermediate-level competitive programming problems. However, training code LMs relies on curated sets of natural-language problem descriptions paired with source-code tests and solutions, and these sets are limited in size. Second, a new format of programming challenge called a programming puzzle was introduced, which does not require a natural-language description and is specified directly by a source-code test. In this work we show how generating synthetic programming puzzles and solutions, verified for correctness by a Python interpreter, can be used to improve performance in solving test puzzles from P3, a public benchmark set of Python Programming Puzzles. This approach also opens the door to iterative self-improvement for LMs in future work.
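To make the puzzle format concrete, the following is a minimal sketch of a puzzle specified purely by a source-code test, together with interpreter-based verification of a candidate solution. The names `puzzle`, `solution`, `verify`, and `keep_verified` are hypothetical and chosen for illustration only; the example puzzle is made up and is not taken from P3, and the filtering step is an assumption about how verified pairs might be collected, not the authors' pipeline.

```python
from typing import Callable, List, Tuple


def puzzle(x: int) -> bool:
    """Illustrative puzzle: the test itself is the full problem statement."""
    return x > 0 and x * x == 1764


def solution() -> int:
    """A candidate answer, e.g. one proposed by a language model."""
    return 42


def verify(test: Callable, candidate: Callable) -> bool:
    """Run the candidate and accept it only if the test returns exactly True.

    Any exception raised by either function counts as a failed verification.
    """
    try:
        return test(candidate()) is True
    except Exception:
        return False


def keep_verified(pairs: List[Tuple[Callable, Callable]]) -> List[Tuple[Callable, Callable]]:
    """Keep only the (test, candidate) pairs that pass verification."""
    return [(t, c) for t, c in pairs if verify(t, c)]


def wrong_solution() -> int:
    """A candidate that does not satisfy the puzzle."""
    return 41


verified = keep_verified([(puzzle, solution), (puzzle, wrong_solution)])
```

In this sketch only the pair with `solution` survives filtering, since `puzzle(41)` is false. A practical setup would also need time limits and sandboxing when executing generated code, which are omitted here.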
