Poster
in
Workshop: The 3rd DL4C Workshop: Emergent Possibilities and Challenges in Deep Learning for Code
Do LLMs Understand Code Preference? Training Code Preference Models via Synthetic Code Evolution
Jiawei Liu · Thanh Nguyen · Mingyue Shang · Hantian Ding · Xiaopeng Li · Yu Yu · Varun Kumar · Zijian Wang
Abstract:
Large Language Models (LLMs) have recently demonstrated remarkable coding capabilities. However, assessing code generation against verifiable properties and aligning it with developer preferences remains challenging. In this paper, we explore two key questions in code preference learning: \textit{(i)} how to train models to predict meaningful code preferences, and \textit{(ii)} how well code preferences based on verifiers, humans, and neural models align with one another. To this end, we introduce \textsc{CodeFavor}, an open recipe for training pairwise code preference models from synthetic code evolution, including code commits and code critiques. We evaluate code preferences via \textsc{CodePrefBench}, a new benchmark of 1,364 rigorously curated code preference tasks covering three verifiable properties: correctness, efficiency, and security, along with human preference. Our evaluation shows that \textsc{CodeFavor} holistically improves model-based code preferences by up to $28.8\%$. Our comprehensive controlled experiments also validate the design choices in \textsc{CodeFavor}. Furthermore, we quantify the cost and limitations of human-based code preference: \textit{(i)} despite spending 23 person-minutes per task, $15\sim 40\%$ of tasks remain unsolved; and \textit{(ii)} human preference is the most accurate on code correctness while underperforming model-based preferences on non-functional objectives.
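The pairwise formulation described above can be illustrated with a short sketch: a preference model is shown two candidate programs for the same task and asked which one better satisfies a criterion such as correctness, efficiency, or security. The prompt wording, the helper names (`build_prompt`, `parse_choice`), and the stubbed model response below are illustrative assumptions, not \textsc{CodeFavor}'s actual implementation.

```python
# Illustrative sketch of a pairwise code preference judgment.
# The prompt template, helper names, and the canned model response are
# assumptions for illustration; they are not CodeFavor's actual code.

PROMPT_TEMPLATE = """\
You are comparing two candidate solutions for the same task.

Criterion: {criterion}

### Code A
{code_a}

### Code B
{code_b}

Which code better satisfies the criterion? Answer with "A" or "B".
"""


def build_prompt(criterion: str, code_a: str, code_b: str) -> str:
    """Fill the pairwise comparison template for a single preference task."""
    return PROMPT_TEMPLATE.format(criterion=criterion, code_a=code_a, code_b=code_b)


def parse_choice(model_output: str) -> str:
    """Map a raw model response to the preferred candidate ('A' or 'B')."""
    answer = model_output.strip().upper()
    if answer.startswith("A"):
        return "A"
    if answer.startswith("B"):
        return "B"
    raise ValueError(f"Unparseable preference response: {model_output!r}")


if __name__ == "__main__":
    prompt = build_prompt(
        criterion="correctness",
        code_a="def mean(xs):\n    return sum(xs) / len(xs)",
        code_b="def mean(xs):\n    return sum(xs) / max(len(xs), 1)",
    )
    # In practice the prompt would be sent to a trained preference model;
    # a canned response stands in here to keep the sketch self-contained.
    canned_response = "B"
    print(parse_choice(canned_response))
```

A benchmark like \textsc{CodePrefBench} can then score such a judge by checking its A/B choices against labels derived from verifiers (tests, profilers, security analyzers) or from human annotators.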