Unifying Perspectives on Learning Biases: A Data-Centric Intervention for Holistic Fairness, Robustness, and Generalization
Patrick Vincent ⋅ Innocent Nyalala
Abstract
Bias in machine learning manifests as a triple pathology: fragility in robustness ($\mathbf{R}$), inequity in fairness ($\mathbf{F}$), and failure in out-of-distribution (OOD) generalization ($\mathbf{G}$). Although these failures are typically addressed as distinct challenges, we argue that they often share a common driver: reliance on non-causal spurious correlations. We propose Dynamic Data Re-weighting (DDR), a data-centric intervention. By leveraging adversarial introspection, DDR identifies and up-weights “hard” samples, on which model predictions are brittle and shortcut-dependent, without requiring spurious group annotations during training. Experiments across four bias settings, covering background bias (Waterbirds), facial attribute bias (CelebA), synthetic domain shift (Corrupted MNIST), and natural class imbalance, demonstrate that this single intervention improves robustness, fairness, and OOD generalization. DDR achieves substantial gains, highlighted by a 76.76 pp OOD accuracy gain on Corrupted MNIST and a 13.85 pp increase in Worst-Group Accuracy on Waterbirds using a standardized ResNet50 backbone. Furthermore, the method reduces the feature–spurious correlation (FSC) between learned representations and spurious attributes, consistent with increased reliance on more invariant features in model decision-making.
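To make the re-weighting idea concrete, the sketch below uses per-sample training loss as a stand-in hardness signal; this proxy, the softmax-based weighting, the temperature parameter, and the function names (`ddr_weights`, `reweighted_step`) are all illustrative assumptions, not the paper's published DDR implementation, which derives hardness from adversarial introspection.

```python
# Minimal sketch of dynamic data re-weighting, assuming a loss-based
# hardness proxy (hypothetical; stands in for adversarial introspection).
import torch
import torch.nn.functional as F


def ddr_weights(per_sample_loss: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    """Map per-sample losses to sample weights: harder samples get heavier weights."""
    # Softmax over detached losses concentrates mass on brittle,
    # shortcut-dependent samples; temperature controls how sharply.
    w = torch.softmax(per_sample_loss.detach() / temperature, dim=0)
    # Rescale so the mean weight is 1, keeping the overall loss scale stable.
    return w * per_sample_loss.numel()


def reweighted_step(model, optimizer, x, y, temperature: float = 1.0) -> torch.Tensor:
    """One training step with dynamically re-weighted cross-entropy."""
    optimizer.zero_grad()
    logits = model(x)
    losses = F.cross_entropy(logits, y, reduction="none")  # per-sample losses
    weights = ddr_weights(losses, temperature)             # up-weight "hard" samples
    (weights * losses).mean().backward()
    optimizer.step()
    return losses.detach()


if __name__ == "__main__":
    # Toy usage: a linear classifier on random data.
    model = torch.nn.Linear(10, 2)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    x, y = torch.randn(32, 10), torch.randint(0, 2, (32,))
    reweighted_step(model, opt, x, y, temperature=0.5)
```

Because the weights are computed from detached losses, gradients flow only through the weighted loss term itself; in practice the hardness scores would come from the model's sensitivity under adversarial probing rather than raw loss.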