ICLR Poster Quantifying the Plausibility of Context Reliance in Neural Machine Translation

Gabriele Sarti · Grzegorz Chrupała · Malvina Nissim · Arianna Bisazza

Halle B #261

Abstract: Establishing whether language models can use contextual information in a human-plausible way is important to ensure their safe adoption in real-world settings. However, the questions of $\textit{when}$ and $\textit{which parts}$ of the context affect model generations are typically tackled separately, and current plausibility evaluations are practically limited to a handful of artificial benchmarks. To address this, we introduce $\textbf{P}$lausibility $\textbf{E}$valuation of $\textbf{Co}$ntext $\textbf{Re}$liance (PECoRe), an end-to-end interpretability framework designed to quantify context usage in language models' generations. Our approach leverages model internals to (i) contrastively identify context-sensitive target tokens in generated texts and (ii) link them to contextual cues justifying their prediction. We use PECoRe to quantify the plausibility of context-aware machine translation models, comparing model rationales with human annotations across several discourse-level phenomena. Finally, we apply our method to unannotated model translations to identify context-mediated predictions and highlight instances of (im)plausible context usage throughout generation.
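
As a rough illustration of step (i), the toy sketch below flags target positions whose next-token distribution changes when context is supplied. It is not the authors' implementation: the choice of KL divergence as the contrastive metric, the threshold value, the function names, and the hand-written probabilities are all assumptions made for this example. Step (ii), linking flagged tokens to contextual cues, is not covered here.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions over the same vocabulary."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def context_sensitive_positions(probs_with_ctx, probs_without_ctx, threshold=0.5):
    """Flag target positions whose distribution shifts by more than `threshold`.

    Both arguments are lists of per-position probability distributions
    (hypothetical model outputs with and without the context).
    The threshold is an illustrative assumption, not a value from the paper.
    """
    scores = [kl_divergence(p, q) for p, q in zip(probs_with_ctx, probs_without_ctx)]
    flagged = [i for i, score in enumerate(scores) if score > threshold]
    return flagged, scores


# Toy distributions over a 3-word vocabulary for 3 target positions.
# Only position 1 changes when the context is provided.
with_ctx = [[0.7, 0.2, 0.1], [0.05, 0.9, 0.05], [0.3, 0.3, 0.4]]
without_ctx = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.3, 0.3, 0.4]]

flagged, scores = context_sensitive_positions(with_ctx, without_ctx)
print(flagged)
```

In a real setting the two lists would come from running the same translation model twice over the same target prefix, once with and once without the preceding context.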
