CROSS-LINGUAL FAIRNESS DRIFT IN LLM MORAL REASONING
Abstract
As large language models (LLMs) are deployed across linguistically and culturally diverse populations, their ethical reasoning must remain robust across demographic subgroups, a prerequisite for fairness under the distributional shifts that deployed systems encounter as user populations evolve. We present a framework for detecting and quantifying cross-population behavioral disparity in LLM moral reasoning across four languages (English, Spanish, Korean, and Mandarin), each representing a distinct cultural subpopulation. Using a seven-pillar evaluation rubric spanning deontological, consequentialist, and virtue-ethical reasoning alongside coherence, context sensitivity, moral uncertainty (MUI), and cultural grounding (CGRI), we evaluate five LLMs on 50 moral dilemmas. Our results reveal subgroup robustness failures: consequentialist bias amplifies in non-English contexts (mean disparity ∆ = +0.11), cultural grounding collapses by up to 88% across languages, and behavioral consistency varies by model. We introduce disparity metrics that quantify behavioral instability across populations and show that current LLMs fail to maintain equitable ethical reasoning when serving linguistically diverse subgroups. These findings establish language as a critical axis for fairness auditing and as a leading indicator of behavioral drift risk in deployed moral reasoning systems.