

Oral in Workshop: Secure and Trustworthy Large Language Models

Group Preference Optimization: Few-Shot Alignment of Large Language Models

Siyan Zhao · John Dang · Aditya Grover


Abstract:

Applications of large language models (LLMs) often demand nuanced judgments that vary across different groups. Existing alignment algorithms can be costly, requiring extensive group-specific data and computation. We present Group Preference Optimization (GPO), a framework that efficiently aligns LLMs to group preferences using a few-shot approach. In GPO, we augment the base LLM with an independent transformer module that predicts a group's preferences over the LLM's generations. For few-shot learning, this module acts as an in-context autoregressive transformer and is trained via meta-learning on several groups. Through empirical validation on opinion adaptation tasks involving US demographic groups, global countries, and individuals, GPO demonstrates superior alignment performance while requiring fewer group-specific preferences and less training and computational resources, surpassing existing strategies such as in-context steering and fine-tuning.
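
To make the described setup concrete, below is a minimal sketch of the few-shot preference module, assuming a simple PyTorch implementation: the architecture sizes, the linear prediction head, the synthetic "groups," and the meta-training loop are all illustrative assumptions, not the authors' released code. The module reads a few (LLM-output embedding, preference) pairs for one group as in-context examples and predicts preferences for new outputs from that same group; meta-training samples one group per step and splits its data into context and target points.

# Illustrative sketch of an in-context preference predictor in the spirit of GPO.
# Assumption: embeddings of LLM generations are precomputed vectors of dimension d;
# preferences are scalar scores. Details below are hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GroupPreferenceModule(nn.Module):
    def __init__(self, embed_dim=64, model_dim=128, n_layers=2, n_heads=4):
        super().__init__()
        # Project an LLM-output embedding together with its (possibly masked)
        # preference score into the transformer's working dimension.
        self.input_proj = nn.Linear(embed_dim + 1, model_dim)
        layer = nn.TransformerEncoderLayer(
            d_model=model_dim, nhead=n_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(model_dim, 1)  # predicted preference score

    def forward(self, ctx_x, ctx_y, tgt_x):
        # ctx_x: (B, m, d) embeddings with known group preferences ctx_y: (B, m)
        # tgt_x: (B, n, d) embeddings whose preferences we want to predict
        ctx = torch.cat([ctx_x, ctx_y.unsqueeze(-1)], dim=-1)
        tgt = torch.cat([tgt_x, torch.zeros_like(tgt_x[..., :1])], dim=-1)
        seq = self.input_proj(torch.cat([ctx, tgt], dim=1))
        hidden = self.encoder(seq)
        n = tgt_x.shape[1]
        return self.head(hidden[:, -n:, :]).squeeze(-1)  # (B, n)

# Meta-training over synthetic "groups" (stand-ins for demographic groups),
# each simulated here by a random linear preference function.
if __name__ == "__main__":
    torch.manual_seed(0)
    model = GroupPreferenceModule()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    d = 64
    groups = [torch.randn(d) for _ in range(10)]
    for step in range(200):
        w = groups[step % len(groups)]          # sample one group per step
        x = torch.randn(1, 24, d)               # embeddings of LLM generations
        y = x @ w / d ** 0.5                    # that group's preference scores
        ctx_x, tgt_x = x[:, :8], x[:, 8:]       # few-shot context vs. targets
        ctx_y, tgt_y = y[:, :8], y[:, 8:]
        pred = model(ctx_x, ctx_y, tgt_x)
        loss = F.mse_loss(pred, tgt_y)
        opt.zero_grad(); loss.backward(); opt.step()

At deployment, the base LLM stays frozen; aligning to a new group amounts to supplying a handful of that group's preference examples as context, with no further gradient updates to either model.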
