Poster in Workshop: From Human Cognition to AI Reasoning: Models, Methods, and Applications

From Examples to Solutions: A Cognitive Framework for LLM Code Generation

Shashwat Saxena ⋅ Shreyas Kowshik ⋅ Vikhyath Kothamasu

Abstract

When learning to solve coding problems, humans rarely approach new challenges from scratch. Instead, they study worked examples that reveal solution patterns and edge cases. This cognitive strategy, well documented in educational psychology, has been largely overlooked in training LLMs for code generation. In this work, we ask: can incorporating worked examples as explicit intermediate representations improve LLM code generation via reinforcement learning? We introduce COACH (COgnitive Abstraction Conditioning for code Help), a framework that decomposes code generation into two stages: an example generator that produces step-by-step solved examples for a given problem, and a solution generator that conditions on these examples to produce code. Both models are trained jointly using Group Relative Policy Optimization (GRPO) and receive the same execution-based reward signal. This shared reward structure incentivizes the example generator to produce examples that genuinely aid solution generation rather than superficial reasoning. On the MBPP benchmark, COACH achieves 49% pass@1 accuracy compared to 37% for vanilla GRPO, a 32% relative improvement. COACH also demonstrates improved sample efficiency, matching the baseline's performance with less than two-fifths of the training data. Qualitatively, we find that COACH’s intermediate representations help the model handle edge cases that end-to-end approaches miss. Our results suggest that explicitly modeling human reasoning patterns, specifically the use of “worked examples” as reasoning scaffolds, offers a promising direction for more effective and interpretable code generation systems.
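
To make the two-stage setup in the abstract concrete, here is a minimal sketch of one rollout group with a shared execution-based reward. It is not the authors' implementation. The function names (`execution_reward`, `group_relative_advantages`, `coach_rollout`), the fraction-of-tests-passed reward, and the way samples are grouped are all assumptions made for illustration. The two generators are stand-in callables rather than real LLMs, and the advantage uses the commonly described GRPO normalization (reward minus group mean, divided by group standard deviation).

```python
import statistics
from typing import Callable, List, Tuple


def execution_reward(code: str, tests: List[str]) -> float:
    """Assumed reward: fraction of assert-style tests that pass for `code`."""
    passed = 0
    for test in tests:
        namespace: dict = {}
        try:
            exec(code, namespace)  # define the candidate solution
            exec(test, namespace)  # run one assertion against it
            passed += 1
        except Exception:
            pass  # any error or failed assertion counts as a failed test
    return passed / len(tests) if tests else 0.0


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize each reward against the mean and std of its sampling group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


def coach_rollout(
    problem: str,
    tests: List[str],
    example_generator: Callable[[str], str],        # hypothetical stage 1
    solution_generator: Callable[[str, str], str],  # hypothetical stage 2
    group_size: int = 4,
) -> Tuple[List[Tuple[str, str]], List[float], List[float]]:
    """Sample a group of (worked examples, code) pairs and score them."""
    samples = []
    for _ in range(group_size):
        examples = example_generator(problem)        # stage 1: worked examples
        code = solution_generator(problem, examples) # stage 2: code given examples
        samples.append((examples, code))
    rewards = [execution_reward(code, tests) for _, code in samples]
    # The same advantage would be credited to both stages of a sample,
    # which is the "shared reward" idea: examples are rewarded only
    # insofar as the code conditioned on them passes the tests.
    advantages = group_relative_advantages(rewards)
    return samples, rewards, advantages
```

In a real training loop the advantages would weight the policy-gradient updates of both models, and generated code would be executed in a sandbox rather than with a bare `exec`.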
