GRPO-VPS: Enhancing Group Relative Policy Optimization with Verifiable Process Supervision for Effective Reasoning

JINGYI WANG ⋅ Lei Zhu ⋅ Tengjin Weng ⋅ Wu ⋅ Haochen Tan ⋅ Jierun Chen ⋅ Chaofan Tao ⋅ Haoli Bai ⋅ LU HOU ⋅ Lifeng Shang ⋅ Xiao-Ping Zhang

Abstract
