Training individually fair ML models with sensitive subspace robustness

Mikhail Yurochkin, Amanda Bower, Yuekai Sun

Keywords: adversarial, fairness, optimization, perturbation, robustness

Thursday: Fairness, Interpretability and Deployment

Abstract: We consider training machine learning models that are fair in the sense that their performance is invariant under certain sensitive perturbations to the inputs. For example, the performance of a resume screening system should be invariant under changes to the gender and/or ethnicity of the applicant. We formalize this notion of algorithmic fairness as a variant of individual fairness and develop a distributionally robust optimization approach to enforce it during training. We also demonstrate the effectiveness of the approach on two ML tasks that are susceptible to gender and racial biases.
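
As an illustration of the invariance notion described in the abstract, the following minimal sketch measures how much a model's score changes when only a sensitive attribute is altered. It is not the authors' training method (the distributionally robust optimization step is not shown); the linear scorer, the feature layout, and the names `linear_score`, `flip_binary`, and `invariance_gap` are all hypothetical and chosen purely for illustration.

```python
from typing import Callable, List, Sequence


def linear_score(weights: Sequence[float]) -> Callable[[Sequence[float]], float]:
    """Return a toy linear scoring function (a stand-in for a trained model)."""
    def score(x: Sequence[float]) -> float:
        return sum(w * v for w, v in zip(weights, x))
    return score


def flip_binary(x: Sequence[float], idx: int) -> List[float]:
    """Copy x and flip the binary (0/1) feature at position idx."""
    y = list(x)
    y[idx] = 1.0 - y[idx]
    return y


def invariance_gap(
    score: Callable[[Sequence[float]], float],
    inputs: Sequence[Sequence[float]],
    sensitive_idx: int,
) -> float:
    """Largest absolute score change caused by flipping the sensitive feature."""
    return max(
        abs(score(x) - score(flip_binary(x, sensitive_idx))) for x in inputs
    )


# Hypothetical feature layout: [years_experience, has_degree, sensitive_attribute]
applicants = [[3.0, 1.0, 0.0], [5.0, 0.0, 1.0]]

biased = linear_score([0.2, 0.1, 0.5])  # puts weight on the sensitive feature
blind = linear_score([0.2, 0.1, 0.0])   # ignores the sensitive feature

gap_biased = invariance_gap(biased, applicants, sensitive_idx=2)
gap_blind = invariance_gap(blind, applicants, sensitive_idx=2)
print(gap_biased, gap_blind)
```

A gap of zero means the score is unchanged by the sensitive perturbation on these inputs. This check only covers a single explicit binary attribute; the abstract's notion of sensitive perturbations is broader.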

