Poster in Affinity Workshop: Tiny Papers Poster Session 6
Can Speculative Sampling Accelerate ReAct Without Compromising Reasoning Quality?
Han Xu · Haipeng Chen
#295
Large language models (LLMs) are increasingly used as agents that interact with external environments. These interactions are commonly facilitated through various prompting paradigms. However, such paradigms require extended interaction traces between the LLM and the environment, resulting in low task-solving efficiency. In this work, we integrate speculative sampling (SpS) into the ReAct paradigm. In particular, we investigate its impact on both the efficiency of ReAct and the quality of its reasoning. Our evaluations on the HotPotQA and FEVER datasets demonstrate that applying speculative sampling alongside ReAct yields a 2.18x-2.62x speedup over ReAct alone, while introducing only a negligible impact on reasoning ability.
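The abstract does not include the authors' implementation, but the core accept/reject step of speculative sampling is well established (a small draft model proposes a block of tokens, and the large target model verifies them in parallel). The sketch below is a generic, minimal Python illustration of that step, not the authors' code; all function names, variable names, and array shapes are hypothetical.

```python
# Minimal sketch of the speculative sampling (SpS) verification step.
# NOT the authors' implementation; names and shapes are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

def speculative_step(draft_probs, target_probs, draft_tokens):
    """Verify a block of k draft tokens against the target model.

    draft_probs:  (k, vocab) distributions from the small draft model.
    target_probs: (k+1, vocab) distributions from the large target model
                  at the same positions (one extra for the bonus token).
    draft_tokens: (k,) tokens sampled from the draft model.
    Returns the list of accepted tokens plus one target-model token.
    """
    accepted = []
    for i, tok in enumerate(draft_tokens):
        p, q = target_probs[i, tok], draft_probs[i, tok]
        # Accept the draft token with probability min(1, p / q).
        if rng.random() < min(1.0, p / q):
            accepted.append(int(tok))
        else:
            # On rejection, resample from the residual distribution
            # max(0, p - q), renormalized, and stop verifying the block.
            residual = np.maximum(target_probs[i] - draft_probs[i], 0.0)
            residual /= residual.sum()
            accepted.append(int(rng.choice(len(residual), p=residual)))
            return accepted
    # All k draft tokens accepted: sample one bonus token from the target.
    accepted.append(int(rng.choice(target_probs.shape[1], p=target_probs[-1])))
    return accepted
```

This accept/reject rule preserves the target model's output distribution exactly, which is consistent with the paper's finding that the speedup comes with only a negligible effect on reasoning quality; any quality change would stem from the agent pipeline around it rather than from the sampling step itself.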