Poster in Workshop: The Future of Machine Learning Data Practices and Repositories
Data Curation for Pluralistic Alignment
Dalia Ali · Aysenur Kocak · Michèle Wieland · Dora Zhao · Allison Koenecke · Orestis Papakyriakopoulos
Human feedback datasets are central to AI alignment, yet current data collection methods do not necessarily capture diverse and complex human values. For example, existing alignment datasets focus on "Harmfulness" and "Helpfulness", but dataset curation should expand to encompass a broader range of human values. In this paper, we introduce a pluralistic alignment dataset that (i) integrates the dimensions of "Toxicity", "Emotional Awareness", "Sensitivity and Openness", "Helpfulness", and "Stereotypical Bias", (ii) reveals previously undiscovered tensions in human ratings of AI-generated content, (iii) shows how demographics and political ideologies shape human preferences in alignment datasets, and (iv) highlights issues in data collection and model fine-tuning. Through a large-scale human evaluation study (N=1,095, across the U.S. and Germany), we identify key challenges in data curation for pluralistic alignment, including the coexistence of conflicting values in human ratings, demographic imbalances, and limitations in reward models and cost functions that prevent them from handling the diversity of values in the datasets. Based on these findings, we develop a set of considerations for researchers and practitioners seeking to achieve inclusive AI models.