Preserving Historical Truth: Detecting Historical Revisionism in Large Language Models
Francesco Ortu ⋅ Joeun Yook ⋅ Punya Syon Pandey ⋅ Keenan Samway ⋅ Bernhard Schölkopf ⋅ Alberto Cazzaniga ⋅ Rada Mihalcea ⋅ Zhijing Jin
Abstract
Large language models (LLMs) are increasingly consulted for historical information by citizens, journalists, and institutions, raising concerns about their tendency to reproduce or amplify historical revisionism: the distortion, omission, or reframing of established facts. We introduce \texttt{HistoricalMisinfo}, a curated dataset of $500$ contested events from $45$ countries, each paired with factual and revisionist narratives. To approximate real-world dissemination, we design $11$ prompt scenarios per event, capturing diverse ways historical content is elicited and framed. Using this benchmark, we evaluate multiple medium-sized LLMs and find systematic vulnerabilities: the prevalence of revisionist outputs varies across models, countries, and prompt types. \texttt{HistoricalMisinfo} provides a practical foundation for auditing the reliability of generative systems and for developing safeguards against the spread of revisionist narratives.