Poster in Workshop: Workshop on Large Language Models for Agents

The ART of LLM Refinement: Ask, Refine, Trust

Kumar Shridhar ⋅ Koustuv Sinha ⋅ Andrew Cohen ⋅ Tianlu Wang ⋅ Ping Yu ⋅ Ramakanth Pasunuru ⋅ Mrinmaya Sachan ⋅ Jason E Weston ⋅ Asli Celikyilmaz

Abstract

Large Language Models (LLMs) have demonstrated remarkable generative abilities, but can they judge the quality of their own generations and self-improve? A popular concept, referred to as self-refinement, postulates that LLMs can detect and correct the errors in their generations when asked to do so. However, recent empirical evidence points in the opposite direction, suggesting that LLMs often struggle to accurately identify errors when reasoning is involved. To address this, we propose a reasoning-with-refinement strategy called ART (Ask, Refine, Trust), which asks necessary questions to decide when an LLM should refine its output, and then affirms or withholds trust in the refinement by ranking it against the initial prediction. On two multistep reasoning tasks, mathematical word problems (GSM8K) and question answering (StrategyQA), ART achieves a performance gain of +5 points over self-refinement baselines while using a much smaller model as the decision maker. We believe that ART, with smaller models making the refinement decisions, can be a cost-effective alternative to fine-tuning LLMs.
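
The Ask, Refine, Trust control flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `art_pipeline`, its four callable parameters, and the toy stand-ins are all hypothetical, chosen only to show the order of the three decisions. In practice each callable would presumably wrap a model call, with the Ask and Trust decisions handled by the smaller decision-maker model.

```python
from typing import Callable


def art_pipeline(
    problem: str,
    generate: Callable[[str], str],
    needs_refinement: Callable[[str, str], bool],
    refine: Callable[[str, str], str],
    prefers_refinement: Callable[[str, str, str], bool],
) -> str:
    """Hypothetical sketch of an Ask -> Refine -> Trust loop."""
    initial = generate(problem)

    # Ask: decide whether the initial prediction should be refined at all.
    if not needs_refinement(problem, initial):
        return initial

    # Refine: produce a revised prediction.
    refined = refine(problem, initial)

    # Trust: rank the refinement against the initial prediction and keep the winner.
    if prefers_refinement(problem, initial, refined):
        return refined
    return initial


# Toy stand-ins (assumptions for illustration only; no model is called).
def toy_generate(problem: str) -> str:
    return "5"  # deliberately wrong first attempt


def toy_needs_refinement(problem: str, answer: str) -> bool:
    return answer != "4"


def toy_refine(problem: str, answer: str) -> str:
    return "4"


def toy_prefers_refinement(problem: str, initial: str, refined: str) -> bool:
    return refined == "4"


result = art_pipeline(
    "What is 2 + 2?",
    toy_generate,
    toy_needs_refinement,
    toy_refine,
    toy_prefers_refinement,
)
print(result)  # prints "4"
```

Keeping the three decisions as separate callables mirrors the abstract's point that the component deciding when to refine and which answer to trust can be a different, smaller model than the one generating answers.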