Poster in Workshop: 2nd Workshop on Mathematical and Empirical Understanding of Foundation Models
MathSensei: Mathematical Reasoning with a Tool-Augmented Large Language Model
Debrup Das · Debopriyo Banerjee · Somak Aditya · Ashish Kulkarni
Large Language Models (LLMs) augmented with tools or modules that are adept at solving specific tasks, commonly termed Tool-Augmented LLMs (TALMs), show superior reasoning abilities over vanilla LLMs across different knowledge-intensive Question Answering (QA) tasks. However, their efficacy on complex mathematical reasoning benchmarks has remained largely unexplored. Moreover, existing research has not studied the complementary benefits that diverse tool sets offer for solving mathematical problems. In this work, we present MathSensei, a TALM-based framework powered by a knowledge retriever (LLM or Bing Web Search), a program generator and executor (Python), and a symbolic problem solver (Wolfram-Alpha). We perform extensive ablations with various tool combinations across multiple math sub-disciplines drawn from different datasets. Our experiments also include an evaluation of well-known planning algorithms such as REACT and Plan-And-Solve. MathSensei outperforms gpt-3.5-turbo with chain-of-thought (CoT) prompting by 13.5% on the MATH dataset. We observe that TALMs become more beneficial as problem complexity increases (as in AQuA, MMLU-Math, and the higher-level, more complex questions in MATH), and offer minimal benefit on simpler math word problems (such as GSM-8k).
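The idea of composing tool modules and ablating over their combinations can be illustrated with a minimal sketch. This is not the MathSensei implementation: the function names, the `TOOLS` registry, and the canned outputs are all hypothetical stand-ins, and a real system would call an LLM, a web search service, and Wolfram-Alpha where the stubs below return fixed strings.

```python
from typing import Callable, Dict, List

# A module receives the question plus the context accumulated so far
# and returns new text to append to that context.
Module = Callable[[str, str], str]


def knowledge_retriever(question: str, context: str) -> str:
    # Hypothetical stub: a real module would query an LLM or a web search API.
    return "Knowledge: the sum of the first n positive integers is n*(n+1)/2."


def program_generator_executor(question: str, context: str) -> str:
    # Hypothetical stub: a real module would ask an LLM to write this program.
    program = "result = sum(range(1, 101))"
    namespace: dict = {}
    # Run the generated program; untrusted code should be sandboxed in practice.
    exec(program, {}, namespace)
    return f"Python output: {namespace['result']}"


def symbolic_solver(question: str, context: str) -> str:
    # Hypothetical stub: a real module would send a query to a symbolic engine.
    return "Symbolic solver: (no query issued in this sketch)"


# Assumed registry mapping module names to implementations.
TOOLS: Dict[str, Module] = {
    "knowledge_retriever": knowledge_retriever,
    "program_executor": program_generator_executor,
    "symbolic_solver": symbolic_solver,
}


def run_pipeline(question: str, modules: List[str]) -> str:
    """Run the named modules in order, accumulating their outputs as context."""
    context = ""
    for name in modules:
        context += TOOLS[name](question, context) + "\n"
    return context


if __name__ == "__main__":
    question = "What is 1 + 2 + ... + 100?"
    # Two tool combinations, in the spirit of an ablation over module sets.
    for combo in (["knowledge_retriever"],
                  ["knowledge_retriever", "program_executor"]):
        print(combo)
        print(run_pipeline(question, combo))
```

Swapping the list passed to `run_pipeline` is the only change needed to try a different tool combination, which is what makes this kind of sequential design convenient for ablation studies.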