Email in the Era of LLMs
Abstract
Email communication increasingly involves large language models (LLMs), but we lack intuition about how they will read, write, and optimize for nuanced social goals. We introduce HR Simulator, a game where communication is the core mechanic: players act as a Human Resources officer and write emails to resolve socially challenging workplace scenarios. An analysis of over 600 human and LLM emails, scored by LLMs-as-judges, reveals evidence that LLMs become more homogeneous in their email quality judgments as they grow larger, suggesting an emerging set of shared LLM norms and values. LLM-only emails outperform human emails under LLM judges (e.g., 48--54% vs. 23.5% success rate), but rewriting human drafts with models reliably improves over human-only emails and can sometimes beat LLM-only emails (e.g., from 40% to nearly 100% in one scenario). Rewrites make human emails more formal and empathetic, which likely contributes to the hybrid advantage. Our results demonstrate the efficacy of communication games as instruments for measuring communication in the era of LLMs, and posit human--LLM co-writing as the most effective form of communication in that future.