Poster in Affinity Event: Tiny Papers Poster Session 3

The IMO Small Challenge: First IMO Dataset for LLMs

Simon Frieder ⋅ Mirek Olšák ⋅ Julius Berner ⋅ Thomas Lukasiewicz

2024 Poster


Abstract

We introduce the IMO Small Challenge: a curated collection of the easiest possible IMO problems and other competitive mathematical problems. The goal is to bridge the existing gap in the difficulty range of available datasets for testing problem-solving skills. Currently, datasets predominantly are too easy (MATH or GSM8K), are excessively challenging (solving arbitrary IMO problems, as required by the IMO Grand Challenge and embodied by the miniF2F dataset), or focus too little on problem-solving (GHOSTS). Our challenge interpolates this difficulty range and serves as a test bench for next-generation language models. We release a preliminary version of a dataset that accompanies this challenge. It is grounded in natural language, and problems are annotated with solutions and other metadata, such as the type of proof strategy used, to facilitate semi-automatic evaluation of LLMs' outputs beyond classical correct-incorrect keyword matching.
