Ant Group: Scaling Hybrid Linear Attention Architecture to Trillion-Scale
Abstract
In this talk, we present our experience scaling hybrid linear attention architectures to the trillion scale through two models from the Ling Team: Ling-2.5-1T and Ring-2.5-1T. These models interleave linear attention with selected softmax attention layers to support efficient long-context training while preserving strong reasoning and representation capability. We share key algorithm–system co-design insights that make trillion-scale hybrid attention practical, including stability techniques for large-scale linear attention training and efficient distributed training for ultra-long sequences.
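The abstract does not describe the exact layer layout of Ling-2.5-1T or Ring-2.5-1T. As a rough illustration of the general idea of a hybrid stack, the sketch below interleaves simple elu(x)+1 feature-map linear attention layers with periodic softmax attention layers; the interleave ratio, dimensions, and all class names are illustrative assumptions, not the Ling/Ring implementation.

```python
# Minimal sketch of a hybrid attention stack (illustrative only):
# mostly linear-attention layers, with a softmax attention layer
# inserted every `softmax_every` blocks.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LinearAttention(nn.Module):
    """Non-causal linear attention using the elu(x) + 1 feature map."""

    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.heads = heads
        self.qkv = nn.Linear(dim, 3 * dim, bias=False)
        self.out = nn.Linear(dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        h = self.heads
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Reshape to (batch, heads, time, head_dim).
        q, k, v = (z.view(b, t, h, d // h).transpose(1, 2) for z in (q, k, v))
        q, k = F.elu(q) + 1, F.elu(k) + 1
        # Linear attention: cost scales with t * d^2 rather than t^2 * d.
        kv = torch.einsum("bhtd,bhte->bhde", k, v)
        z = 1.0 / (torch.einsum("bhtd,bhd->bht", q, k.sum(dim=2)) + 1e-6)
        out = torch.einsum("bhtd,bhde,bht->bhte", q, kv, z)
        return self.out(out.transpose(1, 2).reshape(b, t, d))


class HybridBlock(nn.Module):
    """Pre-norm residual block wrapping either attention variant."""

    def __init__(self, dim: int, use_softmax: bool):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.use_softmax = use_softmax
        if use_softmax:
            self.attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)
        else:
            self.attn = LinearAttention(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm(x)
        if self.use_softmax:
            h, _ = self.attn(h, h, h, need_weights=False)
        else:
            h = self.attn(h)
        return x + h


class HybridAttentionStack(nn.Module):
    """Mostly linear-attention layers with periodic softmax layers."""

    def __init__(self, dim: int = 512, depth: int = 12, softmax_every: int = 4):
        super().__init__()
        self.blocks = nn.ModuleList(
            HybridBlock(dim, use_softmax=((i + 1) % softmax_every == 0))
            for i in range(depth)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for block in self.blocks:
            x = block(x)
        return x


if __name__ == "__main__":
    model = HybridAttentionStack()
    tokens = torch.randn(2, 128, 512)  # (batch, sequence, dim)
    print(model(tokens).shape)  # torch.Size([2, 128, 512])
```

In a hybrid design of this kind, the occasional softmax layers retain full pairwise token interactions while the surrounding linear layers keep per-token cost roughly constant in sequence length, which is what makes long-context training tractable at scale.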