mRNA-GPT: End-to-end Generative Design and Optimization of Full-length mRNA
Abstract
We introduce mRNA-GPT, a generative model for end-to-end design and optimization of full-length mRNA sequences. Unlike existing approaches that optimize isolated regions, mRNA-GPT jointly optimizes all three regions (5′ UTR, CDS, and 3′ UTR) to capture cross-region regulatory interactions critical for therapeutic efficacy. The model is pre-trained on 10 million full-length natural mRNA sequences from diverse species, establishing a robust foundation for sequence generation. It then employs an iterative optimization framework with oracle-based rewards to progressively enhance target properties, including translation efficiency and half-life. mRNA-GPT supports flexible generation modes: single regions, full-length sequences, or conditional generation given any region. Empirical results demonstrate superior performance over state-of-the-art methods: CDS optimization achieves higher predicted translation rates than LinearDesign and GEMORNA while maintaining structural stability, and full-length design captures critical cross-region interactions, yielding enhanced translation efficiency. This unified approach establishes mRNA-GPT as a versatile platform for the rational design of mRNA therapeutics.