Poster in Workshop: 2nd Workshop on Navigating and Addressing Data Problems for Foundation Models (DATA-FM)
Information-theoretic Quantification of Inherent Discrimination Bias in Training Data for Supervised Learning
Sokrat Aldarmini · Mohamed Nafea
Algorithmic fairness research has primarily focused on adapting learning models to mitigate discrimination based on protected attributes, yet the inherent discrimination biases present in training data remain largely unexplored. Given that data mining/engineering and model development are often conducted separately, quantifying these biases for potential downstream models is crucial for informed data engineering. We address this challenge by developing an information-theoretic framework that quantifies the marginal impact of each dataset feature on the discrimination bias of any downstream classifier. Our approach provides theoretical grounds for measures that align with specific desired properties and fairness notions. Specifically, we postulate a set of desired properties for candidate discrimination measures and derive measures that (partially) satisfy them. Distinct subsets of these properties align with different fairness criteria, such as demographic parity or equalized odds, which we show can be in disagreement and hence cannot be simultaneously satisfied by a single measure. We employ the Shapley value from cooperative game theory to determine individual features' marginal contributions to the overall discrimination. We show that some candidate measures become equivalent under Shapley-value aggregation and rigorously prove the aggregation's effectiveness in eliminating redundancy. We conduct a comprehensive empirical ablation study on real-world and synthetic datasets to validate the efficacy of our measures in capturing features' discriminatory impacts. For synthetic data generation, we use a parametric linear structural causal model and systematically examine parameter settings corresponding to diverse data correlation structures, generating numerous datasets under each setting to validate our theoretical framework. Overall, our analysis yields empirically validated guidelines for selecting discrimination measures based on data conditions and fairness criteria, establishing a robust framework for quantifying inherent discrimination bias in the data.
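To make the aggregation step concrete, the sketch below illustrates the general mechanics described in the abstract; it is not the paper's exact construction. It assumes a simple mutual-information-based subset measure I(S; X_T) as a stand-in for the derived discrimination measures, aggregates per-feature marginal contributions with the Shapley value, and draws toy data from a hypothetical linear structural causal model whose coefficients, feature names, and discretization are illustrative assumptions.

```python
import itertools
import math

import numpy as np
import pandas as pd
from sklearn.metrics import mutual_info_score


def subset_discrimination(df, subset, protected="S"):
    """Toy subset-level discrimination value: mutual information I(S; X_T)
    between the protected attribute S and the joint cell formed by the
    (discretized) features in subset T. Illustrative stand-in only; the
    paper derives its measures from postulated properties."""
    if not subset:
        return 0.0
    joint_cell = df[list(subset)].astype(str).apply(lambda row: "|".join(row), axis=1)
    return mutual_info_score(df[protected], joint_cell)


def shapley_contributions(df, features, protected="S"):
    """Shapley-value aggregation: each feature's weighted average marginal
    contribution to the subset-level value, over all subsets of the others."""
    n = len(features)
    phi = {f: 0.0 for f in features}
    for f in features:
        others = [g for g in features if g != f]
        for k in range(len(others) + 1):
            for subset in itertools.combinations(others, k):
                weight = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
                gain = (subset_discrimination(df, subset + (f,), protected)
                        - subset_discrimination(df, subset, protected))
                phi[f] += weight * gain
    return phi


# Toy data from a hypothetical linear structural causal model:
# S -> X1 (strong), S -> X2 (weak), X3 independent of S, Y depends on X1, X3, S.
rng = np.random.default_rng(0)
m = 5000
S = rng.integers(0, 2, m)
X1 = 1.5 * S + rng.normal(size=m)
X2 = 0.2 * S + rng.normal(size=m)
X3 = rng.normal(size=m)
Y = (X1 + X3 + 0.5 * S + rng.normal(size=m) > 0).astype(int)

df = pd.DataFrame({
    "S": S,
    "X1": pd.qcut(X1, 8, labels=False),  # discretize for plug-in MI estimation
    "X2": pd.qcut(X2, 8, labels=False),
    "X3": pd.qcut(X3, 8, labels=False),
    "Y": Y,
})

print(shapley_contributions(df, ["X1", "X2", "X3"]))
# Expected qualitative ordering of contributions: X1 >> X2 > X3 (near zero).
```

Under this stand-in measure, the Shapley weights distribute the subset-level discrimination across features, so a feature strongly coupled to the protected attribute in the toy SCM receives a large contribution while an independent feature receives a contribution near zero; the paper's own measures and guidelines refine this choice according to the targeted fairness criterion.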