Alignment Propagation: From One Agent to Many, From Games to Worlds
Asuka Zheng ⋅ Nicole Hsing ⋅ Yi Zhao ⋅ Haoqin Tu ⋅ Jen-Tse Huang
Abstract
Multi-agent systems require robust alignment, but aligning every agent individually does not scale to open environments with many interacting models. We propose \textbf{Alignment Propagation}, in which cooperative behavior is instilled in a single fine-tuned ``seed'' agent and spreads to untrained agents through interaction. To study this effect, we introduce the \textbf{Alignment Propagation Playground} with two complementary settings: (i) the \textbf{Red-Black Game}, a discrete social dilemma with \textbf{broadcast} deliberation, and (ii) \textbf{Sugarscape}, a continuous resource-competition world with \textbf{pairwise} negotiation. We use a frontier model to generate cooperative Red-Black trajectories, fine-tune a seed agent on them, and deploy seeds into otherwise untrained collectives. A single seed more than doubles cooperation on held-out Red-Black scenarios (26\% $\rightarrow$ 62\%), scaling to 96\% with five seeds. Without retraining, seeds transfer zero-shot to Sugarscape (91.5\% trade success vs.\ 21.6\% for an untrained baseline) and outperform a prompt-based Gemini 3 Pro baseline. Finally, we find that communication topology governs propagation efficiency: broadcast deliberation requires only 20\% seed agents to shift the group, whereas pairwise negotiation requires $\sim$50\%.