

(6 events)
Oral
Fri Apr 24 11:15 AM -- 11:25 AM (PDT) @ 203 A/B
SWINGARENA: Adversarial Programming Arena for Long-context GitHub Issue Solving
Wendong XU ⋅ Jing Xiong ⋅ Chenyang Zhao ⋅ Qiujiang Chen ⋅ Haoran Wang ⋅ Hui Shen ⋅ Zhongwei Wan ⋅ Jianbo Dai ⋅ Taiqiang Wu ⋅ He Xiao ⋅ Chaofan Tao ⋅ Zhuoqing Mao ⋅ Ying Sheng ⋅ Zhijiang Guo ⋅ Hongxia Yang ⋅ Bei Yu ⋅ Lingpeng Kong ⋅ Quanquan Gu ⋅ Ngai Wong
Oral
Fri Apr 24 11:27 AM -- 11:37 AM (PDT) @ 203 A/B
BIRD-INTERACT: Re-imagining Text-to-SQL Evaluation via Lens of Dynamic Interactions
Nan Huo ⋅ Xiaohan Xu ⋅ Jinyang Li ⋅ Per Jacobsson ⋅ Shipei Lin ⋅ Bowen Qin ⋅ Binyuan Hui ⋅ Xiaolong Li ⋅ Ge Qu ⋅ Shuzheng Si ⋅ Linheng Han ⋅ Edward Alexander ⋅ Xintong Zhu ⋅ Rui Qin ⋅ Ruihan Yu ⋅ Yiyao Jin ⋅ Feige Zhou ⋅ Weihao Zhong ⋅ Yun Chen ⋅ Hongyu Liu ⋅ Chenhao Ma ⋅ Fatma Ozcan ⋅ Yannis Papakonstantinou ⋅ Reynold Cheng
Oral
Fri Apr 24 11:39 AM -- 11:49 AM (PDT) @ 203 A/B
EditBench: Evaluating LLM Abilities to Perform Real-World Instructed Code Edits
Wayne Chi ⋅ Valerie Chen ⋅ Ryan Shar ⋅ Aditya Mittal ⋅ Jenny Liang ⋅ Wei-Lin Chiang ⋅ Anastasios Angelopoulos ⋅ Ion Stoica ⋅ Graham Neubig ⋅ Ameet Talwalkar ⋅ Chris Donahue
Oral
Fri Apr 24 11:51 AM -- 12:01 PM (PDT) @ 203 A/B
Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents
Yueqi Song ⋅ Ketan Ramaneti ⋅ Zaid Sheikh ⋅ Ziru Chen ⋅ Boyu Gou ⋅ Tianbao Xie ⋅ Yiheng Xu ⋅ Danyang Zhang ⋅ Apurva Gandhi ⋅ Fan Yang ⋅ Joseph Liu ⋅ Tianyue Ou ⋅ Zhihao Yuan ⋅ Frank F Xu ⋅ Shuyan Zhou ⋅ Xingyao Wang ⋅ Xiang Yue ⋅ Tao Yu ⋅ Huan Sun ⋅ Yu Su ⋅ Graham Neubig
Oral
Fri Apr 24 12:03 PM -- 12:13 PM (PDT) @ 203 A/B
AstaBench: Rigorous Benchmarking of AI Agents with a Scientific Research Suite
Jonathan Bragg ⋅ Mike D'Arcy ⋅ Nishant Balepur ⋅ Dan Bareket ⋅ Bhavana Dalvi Mishra ⋅ Sergey Feldman ⋅ Dany Haddad ⋅ Jena Hwang ⋅ Peter Jansen ⋅ Varsha Kishore ⋅ Bodhisattwa Prasad Majumder ⋅ Aakanksha Naik ⋅ Sigal Rahamimov ⋅ Kyle Richardson ⋅ Amanpreet Singh ⋅ Harshit Surana ⋅ Aryeh Tiktinsky ⋅ Rosni Vasu ⋅ Guy Wiener ⋅ Chloe Anastasiades ⋅ Stefanus Candra ⋅ Jason Dunkelberger ⋅ Daniel Emery ⋅ Rob Evans ⋅ Malachi Hamada ⋅ Regan Huff ⋅ Rodney Kinney ⋅ Matt Latzke ⋅ Jaron Lochner ⋅ Ruben Lozano-Aguilera ⋅ Ngoc-Uyen Nguyen ⋅ Smita Rao ⋅ Amber Tanaka ⋅ Brooke Vlahos ⋅ Peter Clark ⋅ Doug Downey ⋅ Yoav Goldberg ⋅ Ashish Sabharwal ⋅ Daniel Weld
Oral
Fri Apr 24 12:15 PM -- 12:25 PM (PDT) @ 203 A/B
MedAgentGym: A Scalable Agentic Training Environment for Code-Centric Reasoning in Biomedical Data Science
Ran Xu ⋅ Yuchen Zhuang ⋅ Yishan Zhong ⋅ Yue Yu ⋅ Zifeng Wang ⋅ Xiangru Tang ⋅ Hang Wu ⋅ May Dongmei Wang ⋅ Peifeng Ruan ⋅ Donghan Yang ⋅ Tao Wang ⋅ Guanghua Xiao ⋅ Xin Liu ⋅ Carl Yang ⋅ Yang Xie ⋅ Wenqi Shi