Game Reasoning in Humans and Machines
Abstract
Games have long been a productive testbed for research in both natural and artificial intelligence. Classic studies on human chess playing led to influential theories of planning and problem solving. Systems such as Deep Blue, AlphaGo, Pluribus, and Cicero represent landmark achievements in AI. Yet much of this work has focused on expert or superhuman performance. A remarkable feature of human intelligence, by contrast, is that we can make intuitively reasonable judgments and decisions even without much experience. In this talk, I present recent work on modeling how people reason about novel games. We show that people are systematic and adaptively rational both in how they play a game for the first time and in how they evaluate a game (e.g., how fair or how fun it is likely to be) before they have played it even once. We explain these capacities via the Intuitive Gamer, a computational model based on mechanisms of fast and flat goal-directed probabilistic simulation. Across large-scale behavioral studies with over 1,000 participants and 121 two-player strategic board games, our model quantitatively captures human judgments and decisions. I then compare how a battery of language models reasons about such games against the human data. Reasoning models are generally more aligned with people in their evaluations of games than non-reasoning language models, but notable differences remain. More broadly, our work offers insights into how people think effectively about novel problems and could inform the design of aligned AI systems that determine not just how to solve tasks, but whether a task is worth thinking about at all.