

Poster in Workshop: First Workshop on Representational Alignment (Re-Align)

Humans diverge from language models when predicting spoken language

Thomas Botch · Emily Finn

Keywords: [ brain imaging ] [ language ] [ large language models ] [ behavioral alignment ] [ prediction ] [ functional MRI ]


Abstract:

Humans communicate through both spoken and written language, often switching between these modalities depending on their goals. The recent success of large language models (LLMs) has driven researchers to ask how closely these models align with human behavior and neural representations of language. While prior work has shown similarities in how humans and LLMs form predictions about written text, no work has investigated whether LLMs are representative of human predictions of spoken language. We examined the alignment between LLMs and the behavior of human participants (N=300) who predicted words within a story presented as either spoken language or written text. LLM predictions were more similar to humans' predictions of written text than to their predictions of spoken language, even though humans' predictions of spoken language were the most accurate. Then, by training encoding models to predict fMRI-recorded neural activity evoked by the same auditory story, we showed that models based on human predictions of spoken language aligned better with observed brain activity during listening than models based on LLM predictions. These findings suggest that the structure of spoken language carries additional information relevant to human behavior and neural representations.
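Although the page gives only the abstract, the core analysis it describes, fitting voxelwise encoding models from word-prediction features to fMRI responses and comparing held-out accuracy across feature sets, can be sketched concretely. The sketch below is not the authors' code: the feature definitions, array sizes, and synthetic data are all stand-in assumptions, and ridge regression scored by cross-validated voxelwise correlation is just one common choice for this kind of encoding model.

```python
# Minimal encoding-model sketch (illustrative assumptions throughout):
# fit ridge models from two hypothetical feature sets ("human" vs. "LLM"
# word predictions) to fMRI time courses, then compare held-out accuracy.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n_trs, n_voxels, n_features = 400, 50, 16  # assumed sizes

# Hypothetical per-TR stimulus features (random stand-ins here).
X_human = rng.standard_normal((n_trs, n_features))
X_llm = rng.standard_normal((n_trs, n_features))

# Synthetic BOLD data driven partly by the "human" features, so the
# comparison below has a known answer.
true_weights = rng.standard_normal((n_features, n_voxels))
Y = X_human @ true_weights + 2.0 * rng.standard_normal((n_trs, n_voxels))

def encoding_score(X, Y, n_splits=5):
    """Mean held-out voxelwise correlation for a ridge encoding model."""
    scores = []
    for train, test in KFold(n_splits=n_splits).split(X):
        model = RidgeCV(alphas=np.logspace(-2, 4, 13)).fit(X[train], Y[train])
        Y_hat = model.predict(X[test])
        # Correlate predicted and observed time courses per voxel.
        r = [np.corrcoef(Y_hat[:, v], Y[test][:, v])[0, 1]
             for v in range(Y.shape[1])]
        scores.append(np.mean(r))
    return float(np.mean(scores))

print("human-feature model r:", round(encoding_score(X_human, Y), 3))
print("LLM-feature model r:  ", round(encoding_score(X_llm, Y), 3))
```

With this synthetic data the "human" feature set scores higher by construction; the point is the comparison structure (same story, same brain data, different prediction-derived features), not the numbers.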
