ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code

Xiangru Tang ⋅ Yuliang Liu ⋅ Zefan Cai ⋅ Daniel Shao ⋅ Junjie Lu ⋅ Yichi Zhang ⋅ Zexuan Deng ⋅ Helan Hu ⋅ Kaikai An ⋅ Ruijun Huang ⋅ Shuzheng Si ⋅ Chen Sheng ⋅ Haozhe Zhao ⋅ Liang Chen ⋅ Tianyu Liu ⋅ Yujia Qin ⋅ Wangchunshu Zhou ⋅ Yilun Zhao ⋅ Zhiwei Jiang ⋅ Baobao Chang ⋅ Arman Cohan ⋅ Mark Gerstein
