ICLR Poster Reasoning Elicitation in Language Models via Counterfactual Feedback

Alihan Hüyük · Xinnuo Xu · Jacqueline Maasch · Aditya Nori · Javier Hernandez

Hall 3 + Hall 2B #257

Fri 25 Apr 7 p.m. PDT — 9:30 p.m. PDT

Oral presentation: Oral Session 6A
Sat 26 Apr 12:30 a.m. PDT — 2 a.m. PDT

Abstract:

Despite the increasing effectiveness of language models, their reasoning capabilities remain underdeveloped. In particular, causal reasoning through counterfactual question answering is lacking. This work aims to bridge this gap. We first derive novel metrics that balance accuracy on factual and counterfactual questions, capturing a more complete view of the reasoning abilities of language models than traditional factual-only metrics. Second, we propose several fine-tuning approaches that aim to elicit better reasoning mechanisms, in the sense of the proposed metrics. Finally, we evaluate the performance of the fine-tuned language models in a variety of realistic scenarios. In particular, we investigate to what extent our fine-tuning approaches systematically achieve better generalization than the base models on several problems that require, among others, inductive and deductive reasoning capabilities.
