Poster
Context-Parametric Inversion: Why Instruction Finetuning May Not Actually Improve Context Reliance
Sachin Goyal · Christina Baek · Zico Kolter · Aditi Raghunathan
Hall 3 + Hall 2B #595
Oral presentation: Oral Session 3A
Thu 24 Apr 7:30 p.m. PDT — 9 p.m. PDT
Fri 25 Apr midnight PDT — 2:30 a.m. PDT
Abstract:
Large Language Models are instruction-finetuned to enhance their ability to follow user instructions and better comprehend the input context. Still, they often struggle to follow the input context, especially when it contradicts the model's parametric knowledge. This manifests as various failures, such as hallucinations, where a model inserts outdated or unwarranted facts into its response. In this work, we observe an intriguing phenomenon: the context reliance of the model decreases as instruction finetuning progresses, despite an initial expected increase. We call this phenomenon context-parametric inversion. This is surprising, as one would expect instruction tuning to improve the model's ability to follow input instructions. We observe this behavior on multiple general-purpose instruction tuning datasets such as TULU, Alpaca, and Ultrachat, across multiple model families like Llama, Mistral, and Pythia. We perform various controlled studies to rule out some simple hypotheses for this behavior and to isolate which datapoints cause it. We then analyze the phenomenon theoretically to explain why context reliance varies across the finetuning trajectory. We tie the observed context-parametric inversion to properties of the finetuning data, which suggests potential mitigation strategies that provide limited but insightful gains.
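As a rough illustration of what "context reliance" means here (this sketch is not taken from the paper), one can score counterfactual-context probes at successive finetuning checkpoints and track how often the model answers from the provided context rather than from its parametric knowledge. The checkpoint names and the probe example below are hypothetical placeholders.

```python
# Illustrative sketch: measuring context reliance across finetuning checkpoints
# with counterfactual-context QA. Checkpoint paths and probes are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

# Each probe pairs a context that contradicts common parametric knowledge
# with the answer supported by that context.
PROBES = [
    {
        "context": "The Eiffel Tower is located in Rome.",
        "question": "Where is the Eiffel Tower located?",
        "context_answer": "Rome",       # answer supported by the counterfactual context
        "parametric_answer": "Paris",   # answer the model likely memorized in pretraining
    },
]

def context_reliance(checkpoint: str) -> float:
    """Fraction of probes where the answer follows the context, not parametric knowledge."""
    tok = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForCausalLM.from_pretrained(checkpoint)
    follows_context = 0
    for p in PROBES:
        prompt = f"Context: {p['context']}\nQuestion: {p['question']}\nAnswer:"
        inputs = tok(prompt, return_tensors="pt")
        out = model.generate(**inputs, max_new_tokens=10, do_sample=False)
        answer = tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
        if p["context_answer"].lower() in answer.lower():
            follows_context += 1
    return follows_context / len(PROBES)

# Track context reliance along the finetuning trajectory (checkpoint names are placeholders).
for ckpt in ["ckpt-step-500", "ckpt-step-2000", "ckpt-step-8000"]:
    print(ckpt, context_reliance(ckpt))
```

Under the paper's observation, a curve of such scores would rise early in instruction finetuning and then fall, rather than increase monotonically.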