Workshop
Workshop on Reasoning and Planning for Large Language Models
Zhiyuan Hu · Yilun Zhao · Xidong Feng · Min-Yen Kan · Nouha Dziri · Yali Du · Pang Wei Koh · Bryan Hooi · Arman Cohan
This workshop explores the growing capabilities of large language models (LLMs), such as OpenAI's o1 model, in reasoning, planning, and decision-making, highlighting recent advances and challenges. We aim to examine how reinforcement learning methods, post-training optimization, and efficient inference techniques can further enhance LLMs' reasoning capabilities. Topics include training approaches for enhancing reasoning and planning abilities, scaling inference for complex tasks, developing robust benchmarks, and extending LLMs to multi-modal and embodied environments. We will also discuss broader themes such as causal reasoning, collaborative multi-agent systems, uncertainty, and explainability to offer insights and guidance for the further development of reasoning and planning in LLMs.
Schedule
|
Sun 5:30 p.m. - 5:40 p.m.
|
Introduction and Opening Remarks
SlidesLive Video |
🔗 |
|
Sun 5:40 p.m. - 6:10 p.m.
|
Keynote Talk: Yuandong Tian (Meta): Reason by Search or by Representation? A Path Towards Unifying Neural and Symbolic Decision Making
|
🔗 |
|
Sun 6:10 p.m. - 6:40 p.m.
|
Keynote Talk: Guy Van den Broeck (UCLA): Symbolic Reasoning about Large Language Models
SlidesLive Video |
🔗 |
|
Sun 6:50 p.m. - 7:20 p.m.
|
Keynote Talk: Yarin Gal (Oxford)
SlidesLive Video |
🔗 |
|
Sun 7:20 p.m. - 7:50 p.m.
|
Keynote Talk: Natasha Jaques (UW & Google DeepMind)
SlidesLive Video |
🔗 |
|
Sun 7:50 p.m. - 8:50 p.m.
|
Panel Discussion: Yuandong Tian, Junxian He, Yarin Gal, Bo An. Host: Xidong Feng
SlidesLive Video |
🔗 |
|
Sun 9:00 p.m. - 10:30 p.m.
|
Poster Session 1 and Lunch Break
|
🔗 |
|
Sun 10:45 p.m. - 11:15 p.m.
|
Keynote Talk: Stephen McAleer (OpenAI)
SlidesLive Video |
🔗 |
|
Sun 11:15 p.m. - 11:55 p.m.
|
Oral Paper 1 & 2 & 3 & 4
SlidesLive Video |
🔗 |
|
Sun 11:55 p.m. - 1:00 a.m.
|
Poster Session 2
|
🔗 |
|
Mon 1:00 a.m. - 1:30 a.m.
|
Keynote Talk: Bo An (NTU)
SlidesLive Video |
🔗 |
|
Mon 1:30 a.m. - 2:00 a.m.
|
Keynote Talk: Junxian He (HKUST): Taming Reinforcement Learning for Effective and Efficient Reasoners
|
🔗 |
|
Mon 2:00 a.m. - 2:30 a.m.
|
Oral Paper 5 & 6 & 7
SlidesLive Video |
🔗 |
|
Mon 2:30 a.m. - 2:40 a.m.
|
Paper Award & Closing Remarks
SlidesLive Video |
🔗 |
|
-
|
The in-context inductive biases of vision-language models differ across modalities ( Poster ) > link | Kelsey Allen · Eliza Kosoy · Ishita Dasgupta · Andrew Lampinen 🔗 |
|
-
|
ErrorRadar: Benchmarking Complex Mathematical Reasoning of Multimodal Large Language Models Via Error Detection ( Poster ) > link |
16 presenters: Yibo Yan · Shen Wang · Jiahao Huo · Hang Li · BOYAN LI · Jiamin Su · Xiong Gao · YiFan Zhang · Tianlong Xu · Zhendong Chu · Aoxiao Zhong · Kun Wang · Hui Xiong · Philip Yu · Xuming Hu · Qingsong Wen |
|
-
|
Scaling Inference-Time Search with Vision Value Model for Improved Visual Comprehension ( Poster ) > link | Xiyao Wang · Zhengyuan Yang · Linjie Li · Hongjin Lu · Yuancheng Xu · Chung-Ching Lin · Kevin Lin · Furong Huang · Lijuan Wang 🔗 |
|
-
|
Reasoning Effort and Problem Complexity: A Scaling Analysis in LLMs ( Poster ) > link | Benjamin Estermann · Roger Wattenhofer 🔗 |
|
-
|
Adaptive Self-improvement LLM Agentic System for ML Library Development ( Poster ) > link | Genghan Zhang · Victor Weixin Liang · Olivia Hsu · Kunle Olukotun 🔗 |
|
-
|
Evolutionary Prompt Optimization Discovers Emergent Multimodal Reasoning Strategies in Vision-Language Models ( Poster ) > link | Sid Bharthulwar · John Rho · Katrina Brown 🔗 |
|
-
|
PHYSICS: Benchmarking Foundation Models for PhD-Qualifying Exam Physics Problem Solving ( Poster ) > link | Kaiyue Feng · Yilun Zhao · Yixin Liu · Tianyu Yang · Chen Zhao · John Sous · Arman Cohan 🔗 |
|
-
|
SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning ( Poster ) > link | Wanjia Zhao · Mert Yuksekgonul · Shirley Wu · James Y Zou 🔗 |
|
-
|
Limits of Deep Learning: Sequence Modeling through the Lens of Complexity Theory ( Poster ) > link | Nikola Zubic · Federico Soldà · Aurelio Sulser · Davide Scaramuzza 🔗 |
|
-
|
Reveal the Mystery of DPO: The Connection between DPO and RL Algorithms ( Poster ) > link | Xuerui Su · Yue Wang · Jinhua Zhu · Mingyang Yi · Feng Xu · Zhi-Ming Ma · Yuting Liu 🔗 |
|
-
|
Chain-of-Timeline: Enhancing LLM Zero-Shot Temporal Reasoning with SQL-Style Timeline Formalization ( Poster ) > link | Jiaying Wu · Bryan Hooi 🔗 |
|
-
|
Self-Reasoning Language Models: Unfold Hidden Reasoning Chains with Few Reasoning Catalyst ( Poster ) > link | Hongru WANG · Deng Cai · Wanjun Zhong · Shijue Huang · J Pan · Zeming Liu · Kam-Fai Wong 🔗 |
|
-
|
Spectral Journey: How Transformers Predict the Shortest Path ( Poster ) > link | Andrew Cohen · Andrey Gromov · Kaiyu Yang · Yuandong Tian 🔗 |
|
-
|
MFC-Bench: Benchmarking Multimodal Fact-Checking with Large Vision-Language Models ( Poster ) > link | Shengkang Wang · Hongzhan Lin · Ziyang Luo · Zhen Ye · Guang Chen · Jing Ma 🔗 |
|
-
|
Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models ( Poster ) > link | (Andrew) Zhanke Zhou · Xuan Li · Zhaocheng Zhu · Mikhail Galkin · Xiao Feng · Sanmi Koyejo · Jian Tang · Bo Han 🔗 |
|
-
|
Feedback-Aware Monte Carlo Tree Search for Efficient Information Seeking in Goal-Oriented Conversations ( Poster ) > link | Harshita Chopra · Chirag Shah 🔗 |
|
-
|
Think Outside the Bot: Automating Evaluation of Creativity in LLMs for Physical Reasoning with Semantic Entropy and Efficient Multi-Agent Judge ( Poster ) > link | Min Sen Tan · Zachary Choy · Swaagat Saikia · Syed Ali Redha Alsagoff · Banerjee Mohor · Nadya Wangsajaya · Alvin Chan 🔗 |
|
-
|
RuleArena: A Benchmark for LLM Rule-Guided Reasoning in Real-World Scenarios ( Poster ) > link | Ruiwen Zhou · Wenyue Hua · Liangming Pan · Sitao Cheng · Xiaobao Wu · En Yu · William Wang 🔗 |
|
-
|
Plan$^\ast$RAG: Efficient Test-Time Planning for Retrieval Augmented Generation ( Poster ) > link | Prakhar Verma · Sukruta Midigeshi · Gaurav Sinha · Arno Solin · Nagarajan Natarajan · Amit Sharma 🔗 |
|
-
|
QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search ( Poster ) > link | Zongyu Lin · Yao Tang · Xingcheng Yao · Da Yin · Ziniu Hu · Yizhou Sun · Kai-Wei Chang 🔗 |
|
-
|
PDE-Controller: LLMs for Autoformalization and Reasoning of PDEs ( Poster ) > link | Mauricio Soroco · Jialin Song · Mengzhou Xia · Kye Emond · Weiran Sun · Wuyang Chen 🔗 |
|
-
|
Local Look-Ahead Guidance via Verifier-in-the-Loop for Automated Theorem Proving ( Poster ) > link | Sara Rajaee · Kumar Pratik · Gabriele Cesa · Arash Behboodi 🔗 |
|
-
|
MIR-Bench: Benchmarking LLM's Long-Context Intelligence via Many-Shot In-Context Inductive Reasoning ( Poster ) > link | Kai Yan · Zhan Ling · Kang Liu · Yifan Yang · Ting-Han Fan · Lingfeng Shen · Zhengyin Du · Jiecao Chen 🔗 |
|
-
|
WebWalker: Benchmarking LLMs in Web Traversal ( Poster ) > link |
11 presenters: Jialong Wu · Wenbiao Yin · Yong Jiang · Zhenglin Wang · Zekun Xi · Runnan Fang · Linhai Zhang · Yulan He · Deyu Zhou · Pengjun Xie · Fei Huang |
|
-
|
IGDA: Interactive Graph Discovery through Large Language Model Agents ( Poster ) > link | Alexander Havrilla · David Alvarez-Melis · Nicolo Fusi 🔗 |
|
-
|
When Debate Fails: Bias Reinforcement in Large Language Models ( Poster ) > link | Jihwan Oh · Minchan Jeong · Jongwoo Ko · Se-Young Yun 🔗 |
|
-
|
InductionBench: LLMs Fail in the Simplest Complexity Class ( Poster ) > link | Wenyue Hua · Fei Sun · Liangming Pan · Adam Jardine · William Wang 🔗 |
|
-
|
MMCode: Benchmarking Multimodal Large Language Models in Code Generation with Visually Rich Programming Problems ( Poster ) > link | Kaixin Li · Yuchen Tian · Qisheng Hu · Ziyang Luo · Zhiyong Huang · Jing Ma 🔗 |
|
-
|
Large Language Model-Enhanced Multi-Armed Bandits ( Poster ) > link | Jiahang Sun · Zhiyong Wang · Runhan Yang · Chenjun Xiao · John C.S. Lui · Zhongxiang Dai 🔗 |
|
-
|
Unraveling Arithmetic in Large Language Models: The Role of Algebraic Structures ( Poster ) > link | Fu-Chieh Chang · You-Chen Lin · Pei-Yuan Wu 🔗 |
|
-
|
Flex-TravelPlanner: A Benchmark for Flexible Planning with Language Agents ( Poster ) > link | Juhyun Oh · Eunsu Kim · Alice Oh 🔗 |
|
-
|
Improving Test-Time Search for LLMs with Backtracking Against In-Context Value Verifiers ( Poster ) > link | Anikait Singh · Kushal Arora · Sedrick Keh · Jean Mercat · Tatsunori Hashimoto · Chelsea Finn · Aviral Kumar 🔗 |
|
-
|
Teaching Transformers Causal Reasoning through Axiomatic Training ( Poster ) > link | Aniket Vashishtha · Abhinav Kumar · Atharva Pandey · Abbavaram Gowtham Reddy · Kabir Ahuja · Vineeth Balasubramanian · Amit Sharma 🔗 |
|
-
|
OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning ( Poster ) > link | Pan Lu · Bowen Chen · Sheng Liu · Rahul Thapa · Joseph Boen · James Y Zou 🔗 |
|
-
|
GRAPE: Generalizing Robot Policy via Preference Alignment ( Poster ) > link | Zijian Zhang · Kaiyuan Zheng · Zhaorun Chen · Joel Jang · Yi Li · Siwei Han · Chaoqi Wang · Mingyu Ding · Dieter Fox · Huaxiu Yao 🔗 |
|
-
|
A Simple Model of Inference Scaling Laws ( Poster ) > link | Noam Levi 🔗 |
|
-
|
ScreenSpot-Pro: GUI Grounding for Professional High-Resolution Computer Use ( Poster ) > link | Kaixin Li · Meng ziyang · Hongzhan Lin · Ziyang Luo · Yuchen Tian · Jing Ma · Zhiyong Huang · Tat-Seng Chua 🔗 |
|
-
|
Re-Imagine: Symbolic Benchmark Synthesis for Reasoning Evaluation ( Poster ) > link | Xinnuo Xu · Rachel Lawrence · Kshitij Dubey · Atharva Pandey · Fabian Falck · Risa Ueno · Aditya Nori · Rahul Sharma · Amit Sharma · Javier Hernandez 🔗 |
|
-
|
Large Language Models to Diffusion Finetuning ( Poster ) > link | Edoardo Cetin · Tianyu Zhao · Yujin Tang 🔗 |
|
-
|
Optimizing Test-Time Compute via Meta Reinforcement Finetuning ( Poster ) > link | Yuxiao Qu · Matthew Yang · Amrith Setlur · Lewis Tunstall · Edward Beeching · Russ Salakhutdinov · Aviral Kumar 🔗 |
|
-
|
When More is Less: Understanding Chain-of-Thought Length in LLMs ( Poster ) > link | Yuyang Wu · Yifei Wang · Tianqi Du · Stefanie Jegelka · Yisen Wang 🔗 |
|
-
|
FinMR: A Comprehensive Benchmark for Multimodal Financial Reasoning with Insights from Error Feedback Learning ( Poster ) > link | SHUANGYAN DENG · Haizhou Peng · Jiachen Xu · Chunhou Liu · Ciprian Doru Giurcaneanu · Jiamou Liu 🔗 |
|
-
|
Cutting Through the Noise: Boosting LLM Performance on Math Word Problems ( Poster ) > link | Ujjwala Anantheswaran · Himanshu Gupta · Kevin Scaria · Shreyas Verma · Chitta Baral · Swaroop Mishra 🔗 |
|
-
|
Decoupling the components of geometric understanding ( Poster ) > link | Eliza Kosoy · Annya Dahmani · Andrew Lampinen · Iulia Comsa · Soojin Jeong · Ishita Dasgupta · Kelsey Allen 🔗 |
|
-
|
Enhancing Mathematical Reasoning in Language Models Through Focused Differentiation Training ( Poster ) > link | Zhiyu Zhao · Yongcheng Zeng · Ning Yang · Zihan Zhao · Haifeng Zhang · Jun Wang · Guoqing Liu 🔗 |
|
-
|
MAS-GPT: Training LLMs To Build LLM-Based Multi-Agent Systems ( Poster ) > link | Rui Ye · Shuo Tang · Rui Ge · Yaxin Du · Zhenfei Yin · Jing Shao · Siheng Chen 🔗 |
|
-
|
ARIES: Stimulating Self-Refinement of Large Language Models with and for Iterative Preference Optimization ( Poster ) > link | Yongcheng Zeng · Xuanfa Jin · Guoqing Liu · Quan He · Dong Li · Jianye HAO · Haifeng Zhang · Jun Wang 🔗 |
|
-
|
Think to Ground: Improving Spatial Reasoning in LLMs for better Visual Grounding ( Poster ) > link | Karun Sharma · Vidushee Vats 🔗 |
|
-
|
LM2: Large Memory Models for Long Context Reasoning ( Poster ) > link | Jikun Kang · Wenqi Wu · Filippos Christianos · Alex Chan · Fraser Greenlee · George Thomas · Marvin Purtorab · Andrew Toulis 🔗 |
|
-
|
Offline Reinforcement Learning for LLM Multi-Step Reasoning ( Poster ) > link | Huaijie Wang · Shibo Hao · Hanze Dong · Shenao Zhang · Yilin Bao · Ziran Yang · Yi Wu 🔗 |
|
-
|
Can Large Language Models Reason? A Characterization via 3-SAT ( Poster ) > link | RISHI HAZRA · Gabriele Venturato · Pedro Zuidberg Dos Martires · Luc De Raedt 🔗 |
|
-
|
Towards Hierarchical Multi-Agent Workflows for Zero-Shot Prompt Optimization ( Poster ) > link | Yuchi Liu · Jaskirat Singh · Gaowen Liu · Ali Payani · Liang Zheng 🔗 |
|
-
|
Rationalization Models for Text-to-SQL ( Poster ) > link | Gaetano Rossiello · Nhan Pham · Michael Glass · Junkyu Lee · Dharmashankar Subramanian 🔗 |
|
-
|
PC-Agent: A Hierarchical Agentic Framework for Complex Task Automation on PC ( Poster ) > link |
11 presenters: Haowei Liu · Xi Zhang · Haiyang Xu · Yuyang Wanyan · Junyang Wang · Ming Yan · Ji Zhang · Chunfeng Yuan · Changsheng Xu · Weiming Hu · Fei Huang |
|
-
|
Navigating Solution Spaces in Large Language Models through Controlled Embedding Exploration ( Poster ) > link | Qinglin Zhu · Runcong Zhao · Hanqi Yan · Yulan He · Yudong Chen · Lin Gui 🔗 |
|
-
|
Agentic Knowledgeable Self-awareness ( Poster ) > link |
11 presenters: Shuofei Qiao · Zhisong Qiu · Baochang Ren · Xiaobin Wang · Xiangyuan Ru · Ningyu Zhang · Xiang Chen · Yong Jiang · Pengjun Xie · Fei Huang · Huajun Chen |
|
-
|
Understanding Reasoning in Thinking Language Models via Steering Vectors ( Poster ) > link | Constantin Venhoff · Iván Arcuschin · Philip Torr · Arthur Conmy · Neel Nanda 🔗 |
|
-
|
RL-STaR: Theoretical Analysis of Reinforcement Learning Frameworks for Self-Taught Reasoner ( Poster ) > link | Fu-Chieh Chang · Yu-Ting Lee · Hui-Ying Shih · Yi Tseng · Pei-Yuan Wu 🔗 |
|
-
|
LogitGaze: Predicting Human Attention Using Semantic Information from Vision-Language Models ( Poster ) > link | Dmitry Lvov · Ilya Pershin 🔗 |
|
-
|
Resolving Ambiguity through Personalization in LLM chat systems ( Poster ) > link | Sophia Sun · Abishek Sankararaman · Balakrishnan Narayanaswamy 🔗 |
|
-
|
MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations ( Poster ) > link |
18 presenters: Kaixuan Huang · Jiacheng Guo · Zihao Li · Xiang Ji · Jiawei Ge · Wenzhe Li · Yingqing Guo · Tianle Cai · Hui Yuan · Runzhe Wang · Yue Wu · Ming Yin · Shange Tang · Yangsibo Huang · Chi Jin · Xinyun Chen · Chiyuan Zhang · Mengdi Wang |
|
-
|
Understanding Inference Scaling Laws for Mixtures of LLMs ( Poster ) > link | Alexander Havrilla · Srishti Gureja 🔗 |
|
-
|
s1: Simple test-time scaling ( Poster ) > link | Niklas Muennighoff · Zitong Yang · Weijia Shi · XIANG LI · Li Fei-Fei · Hanna Hajishirzi · Luke Zettlemoyer · Percy Liang · Emmanuel Candes · Tatsunori Hashimoto 🔗 |
|
-
|
Can Medical Vision-Language Pre-training Succeed with Purely Synthetic Data? ( Poster ) > link | Che Liu 🔗 |
|
-
|
Chain-of-Thought Reasoning in the Wild is not Always Faithful ( Poster ) > link | Iván Arcuschin · Jett Janiak · Robert Krzyzanowski · Senthooran Rajamanoharan · Neel Nanda · Arthur Conmy 🔗 |
|
-
|
Training Large Language Models to Reason in a Continuous Latent Space ( Poster ) > link | Shibo Hao · Sainbayar Sukhbaatar · Andy (DiJia) Su · Xian Li · Zhiting Hu · Jason E Weston · Yuandong Tian 🔗 |
|
-
|
Refining Answer Distributions for Improved Large Language Model Reasoning ( Poster ) > link | Soumyasundar Pal · Didier Chételat · Yingxue Zhang · Mark Coates 🔗 |
|
-
|
Strategic LLM Decoding through Bayesian Games ( Poster ) > link | Weitong Zhang · Chengqi Zang · Bernhard Kainz 🔗 |
|
-
|
Meta-Prompt Optimization for LLM-Based Sequential Decision Making ( Poster ) > link | Mingze Kong · Zhiyong Wang · Yao Shu · Zhongxiang Dai 🔗 |
|
-
|
Disentangling Exploration of Large Language Models by Optimal Exploitation ( Poster ) > link | Tim Grams · Patrick Betz · Christian Bartelt 🔗 |
|
-
|
TRIG-Bench: A Benchmark for Text-Rich Image Grounding ( Poster ) > link | Ming Li · Ruiyi Zhang · Jian Chen · Tianyi Zhou 🔗 |
|
-
|
EcoAct: Economic Agent Determines When to Register What Action ( Poster ) > link |
11 presenters: Shaokun Zhang · Jieyu Zhang · Dujian Ding · Jiale Liu · Mirian Hipolito Garcia · Ankur Mallick · Daniel Madrigal · Menglin Xia · Victor Rühle · Qingyun Wu · Chi Wang |
|
-
|
Revealing chemical reasoning in LLMs through search on complex planning tasks ( Poster ) > link | Andres M Bran · Théo Neukomm · Daniel Armstrong · Zlatko Jončev · Philippe Schwaller 🔗 |
|
-
|
Language Models Use Trigonometry to Do Addition ( Poster ) > link | Subhash Kantamneni · Max Tegmark 🔗 |
|
-
|
MastermindEval: A Simple But Scalable Reasoning Benchmark ( Poster ) > link | Jonas Golde · Patrick Haller · Fabio Barth · Alan Akbik 🔗 |
|
-
|
Multi-Agent Verification: Scaling Test-Time Compute with Goal Verifiers ( Poster ) > link | Shalev Lifshitz · Sheila McIlraith · Yilun Du 🔗 |
|
-
|
Think Smarter not Harder: Adaptive Reasoning with Inference Aware Optimization ( Poster ) > link |
12 presenters: Zishun Yu · Tengyu Xu · Di Jin · Karthik Abinav Sankararaman · Yun He · Wenxuan Zhou · Zhouhao Zeng · Eryk Helenowski · Chen Zhu · Sinong Wang · Hao Ma · Han Fang |
|
-
|
AutoToM: Automated Bayesian Inverse Planning and Model Discovery for Open-ended Theory of Mind ( Poster ) > link | Zhining Zhang · Chuanyang Jin · Mung Jia · Tianmin Shu 🔗 |
|
-
|
CodeSteer: Symbolic-Augmented Language Models via Code/Text Guidance ( Poster ) > link | Yongchao Chen · Yilun Hao · Yueying Liu · Yang Zhang · Chuchu Fan 🔗 |
|
-
|
StochasTok: Improving Fine-Grained Subword Understanding in LLMs ( Poster ) > link | Anya Sims · Cong Lu · Klara Kaleb · Jakob Foerster · Yee Whye Teh 🔗 |
|
-
|
LightTransfer: Your Long-Context LLM is Secretly a Hybrid Model with Effortless Adaptation ( Poster ) > link | Xuan Zhang · Fengzhuo Zhang · Cunxiao Du · Chao Du · Tianyu Pang · Wei Gao · Min Lin 🔗 |
|
-
|
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling ( Poster ) > link | Runze Liu · Junqi Gao · Jian Zhao · Kaiyan Zhang · Xiu Li · Biqing Qi · Wanli Ouyang · Bowen Zhou 🔗 |
|
-
|
MINDSTORES: Memory-Informed Neural Decision Synthesis for Task-Oriented Reinforcement in Embodied Systems ( Poster ) > link | Anirudh Chari · Suraj Reddy · Aditya Tiwari · Richard Lian · Brian Zhou 🔗 |
|
-
|
Reasoning3D - Grounding and Reasoning in 3D: Fine-Grained Zero-Shot Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language Models ( Poster ) > link | Tianrun Chen · Chunan Yu · Jing Li · Jianqi Zhang · Lanyun Zhu · Deyi Ji · Yong Zhang · Ying Zang · Lingyun Sun · Zejian Li 🔗 |
|
-
|
Multi-modal Agent Tuning: Building a VLM-Driven Agent for Efficient Tool Usage ( Poster ) > link | Zhi Gao · Bofei Zhang · Pengxiang Li · Xiaojian Ma · Tao Yuan · Yue Fan · Yuwei Wu · Yunde Jia · Song-Chun Zhu · Qing Li 🔗 |
|
-
|
Learning to Defer for Causal Discovery with Imperfect Experts ( Poster ) > link | Oscar Clivio · Divyat Mahajan · Perouz Taslakian · Sara Magliacane · Ioannis Mitliagkas · Valentina Zantedeschi · Alexandre Drouin 🔗 |
|
-
|
Multi-Turn Code Generation Through Single-Step Rewards ( Poster ) > link | Arnav Kumar Jain · Gonzalo Gonzalez-Pumariega · Wayne Chen · Alexander Rush · Wenting Zhao · Sanjiban Choudhury 🔗 |
|
-
|
LookPlanGraph: Embodied instruction following method with VLM graph augmentation ( Poster ) > link | Anatolii Onishchenko · Aleksey Kovalev · Aleksandr Panov 🔗 |
|
-
|
Diving into Self-Evolve Training for Multimodal Reasoning ( Poster ) > link | Wei Liu · Junlong Li · Xiwen Zhang · FAN ZHOU · Yu Cheng · Junxian He 🔗 |
|
-
|
On the Language of Thoughts in Large Language Models ( Poster ) > link | Chenxi Liu · Yongqiang Chen · Tongliang Liu · James Cheng · Bo Han · Kun Zhang 🔗 |
|
-
|
Scaling Flaws of Verifier-guided Search in Mathematical Reasoning ( Poster ) > link | Fei Yu · Yingru Li · Wang Benyou 🔗 |
|
-
|
DEDUCE: Deductive Consistency as a Framework to Evaluate LLM Reasoning ( Poster ) > link | Atharva Pandey · Kshitij Dubey · Rahul Sharma · Amit Sharma 🔗 |
|
-
|
BOLT: Bootstrap Long Chain-of-Thought in Language Models without Distillation ( Poster ) > link | Bo Pang · Hanze Dong · Jiacheng Xu · Silvio Savarese · Yingbo Zhou · Caiming Xiong 🔗 |
|
-
|
Rethinking Fine-tuning when Scaling Test-Time Compute: Limiting Confidence Improves Mathematical Reasoning ( Poster ) > link | Feng Chen · Allan Raventos · Nan Cheng · Surya Ganguli · Shaul Druckmann 🔗 |
|
-
|
DetectiveQA: Evaluating Long-Context Reasoning on Detective Novels ( Poster ) > link |
11 presenters: Zhe Xu · Jiasheng Ye · Xiaoran Liu · Xiangyang Liu · Tianxiang Sun · Zhigeng Liu · Qipeng Guo · Linlin Li · Qun Liu · Xuanjing Huang · Xipeng Qiu |
|
-
|
TACO: Learning Multi-modal Models to Reason and Act with Synthetic Chains-of-Thought-and-Action ( Poster ) > link |
12 presenters: Zixian Ma · Jianguo Zhang · Zhiwei Liu · Jieyu Zhang · Juntao Tan · Manli Shu · Juan Carlos Niebles · Shelby Heinecke · Huan Wang · Caiming Xiong · Ranjay Krishna · Silvio Savarese |
|
-
|
Unveiling and Enhancing Multimodal In-context Learning of Large Vision-language Models ( Poster ) > link | Yanshu Li 🔗 |
|
-
|
Reinforcement Learning in Inference Time: A Perspective from Successive Policy Iterations ( Poster ) > link | Xinnan Zhang · Chenliang Li · Siliang Zeng · Jiaxiang Li · Zhongruo Wang · Songtao Lu · Alfredo Garcia · Mingyi Hong 🔗 |
|
-
|
Divide, Reweight, and Conquer: A Logit Arithmetic Approach for In-Context Learning ( Poster ) > link | Chengsong Huang · Langlin Huang · Jiaxin Huang 🔗 |
|
-
|
Implicit Language Models are RNNs: Balancing Parallelization and Expressivity ( Poster ) > link | Mark Schoene · Babak Rahmani · Heiner Kremer · Fabian Falck · Hitesh Ballani · Jannes Gladrow 🔗 |
|
-
|
Clock and Calendar Understanding Challenges in Multimodal Large Language Models ( Poster ) > link | Rohit Saxena · Aryo Pradipta Gema · Pasquale Minervini 🔗 |
|
-
|
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction ( Poster ) > link | Yiheng Xu · Zekun Wang · Junli Wang · Dunjie Lu · Tianbao Xie · Amrita Saha · Doyen Sahoo · Tao Yu · Caiming Xiong 🔗 |
|
-
|
Assessing Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks ( Poster ) > link | Fangru Lin · Shaoguang Mao · Emanuele La Malfa · Valentin Hofmann · Adrian de Wynter · Xun Wang · Si-Qing Chen · Michael Wooldridge · Janet Pierrehumbert · Furu Wei 🔗 |
|
-
|
LLMs Aren't Good Strategists, Yet Can Accumulate Episodes for Improved Planning ( Poster ) > link | Yi Wu · Zhimin Hu 🔗 |
|
-
|
Value-Based Deep RL Scales Predictably ( Poster ) > link | Oleh Rybkin · Michal Nauman · Preston Fu · Charlie Snell · Pieter Abbeel · Sergey Levine · Aviral Kumar 🔗 |
|
-
|
Benchmarking Agentic Workflow Generation ( Poster ) > link | Shuofei Qiao · Runnan Fang · Zhisong Qiu · Xiaobin Wang · Ningyu Zhang · Yong Jiang · Pengjun Xie · Fei Huang · Huajun Chen 🔗 |
|
-
|
A Benchmark for In-Context Imitation Learning with Long Multimodal Demonstrations ( Poster ) > link | Anian Ruoss · Fabio Pardo · Harris Chan · Bonnie Li · Volodymyr Mnih · Tim Genewein 🔗 |
|
-
|
ReVISE: Learning to Refine at Test-Time via Intrinsic Self-Verification ( Poster ) > link | Hyunseok Lee · Seunghyuk Oh · Jihoon Tack · Jaehyung Kim · Jinwoo Shin 🔗 |
|
-
|
Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners ( Poster ) > link | Daniele Paliotta · Junxiong Wang · Matteo Pagliardini · Kevin Li · Aviv Bick · Albert Gu · François Fleuret · Tri Dao 🔗 |