Poster in Workshop: AI for Peace

Preserving Historical Truth: Detecting Historical Revisionism in Large Language Models

Francesco Ortu ⋅ Joeun Yook ⋅ Punya Syon Pandey ⋅ Keenan Samway ⋅ Bernhard Schölkopf ⋅ Alberto Cazzaniga ⋅ Rada Mihalcea ⋅ Zhijing Jin


Abstract

Large language models (LLMs) are increasingly consulted for historical information by citizens, journalists, and institutions, raising concerns about their tendency to reproduce or amplify historical revisionism: the distortion, omission, or reframing of established facts. We introduce HistoricalMisinfo, a curated dataset of 500 contested events from 45 countries, each paired with factual and revisionist narratives. To approximate real-world dissemination, we design 11 prompt scenarios per event, capturing diverse ways historical content is elicited and framed. Using this benchmark, we evaluate multiple medium-sized LLMs and find systematic vulnerabilities: the prevalence of revisionist outputs varies across models, countries, and prompt types. HistoricalMisinfo provides a practical foundation for auditing the reliability of generative systems and for developing safeguards against the spread of revisionist narratives.
