Aligned but Stereotypical? Understanding and Mitigating Social Bias in LLM-Driven Text-to-Image Models
Abstract
LLM-based text-to-image (T2I) systems improve prompt understanding, but their effect on demographic bias remains under-explored. In this paper, we find that recent LLM-based T2I models produce more demographically biased images than non-LLM baselines. To study this behavior, we introduce SocBiasBench, a 1,024-prompt benchmark spanning four levels of prompt complexity. Using decoded-text analysis, token-probability probes, and embedding-space analysis, we identify system-prompt conditioning as an important pathway through which demographic priors affect image generation. Motivated by this finding, we propose FairPro, a training-free test-time method that uses the embedded LLM to construct an input-dependent system prompt that mitigates stereotypical demographic completions while preserving user intent. Across recent LLM-based T2I models, FairPro reduces demographic bias while preserving text-image alignment, suggesting that system prompts are a practical intervention point for fairer T2I generation.