Invited Talk in Workshop: ICLR 2025 Workshop on Human-AI Coevolution
AI Homogenization and Human Variation
AJ Alvero
When large language models (LLMs) generate "human-like text", which humans do they write like? The answer to this question has strong implications for the use of LLMs as methodological tools across the social sciences, and it also points to new dynamics to consider in important social processes such as evaluation. In this presentation, I will discuss published and ongoing work that compares text produced by LLMs with text written by humans. Most of the work in this talk uses selective college admissions as a test site to examine these dynamics, but I will also show how they hold for open-ended survey questions and even for non-textual media such as images produced by generative AI. Collectively, the studies complicate the driving question of which humans LLMs write like. By popular stylistic measures, the answer is people from high social status communities; by other measures, the answer is blurred by patterns of AI homogenization. These results helped us formulate two perspectives for future research, both for comparisons with humans and for the overarching objectives of AI sociolinguistic design: the sampling perspective, based on linguistic closeness, and the distributional perspective, based on variation in text features. We hope this work can help spur not just more empirical work but also theoretical work on how human language compares to and interacts with LLMs.