Poster in Workshop: AI for Nucleic Acids (AI4NA)
EvoFlow-RNA: Generating and Representing non-coding RNA with a Language Model
Sawan Patel · Zhangzhi Peng · Keith Fraser · Adam Friedman · Pranam Chatterjee · Sherwood Yao
RNA plays a critical role across numerous biological functions. Recent advances in language modeling show promise for representing RNA, but the possibility of large-scale RNA design and optimization has yet to be explored. We propose EvoFlow-RNA, a bidirectional RNA language model that leverages masked discrete diffusion models (MDMs) for both generative modeling and representation learning. EvoFlow-RNA bridges the gap between RNA sequence representation and design. It outperforms leading RNA models on six BEACON tasks, excelling in secondary structure prediction. For unconditional generation, it synthesizes diverse RNA sequences with native-like biophysical properties. Furthermore, EvoFlow-RNA can optimize aptamer sequences while preserving binding recognition sites. Our results demonstrate EvoFlow-RNA’s effectiveness in RNA modeling, highlighting the capability and potential of masked discrete diffusion for RNA design.
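To make the idea of generation by masked discrete diffusion concrete, here is a minimal Python sketch of iterative unmasking. It is not the EvoFlow-RNA implementation: the function names (`toy_denoiser`, `generate`), the `_` mask token, the confidence-ranked unmasking schedule, and the `fixed` argument are all assumptions made for illustration, and the "model" simply proposes nucleotides uniformly at random. The `fixed` positions loosely mirror the abstract's point about preserving binding recognition sites while the rest of the sequence is redesigned.

```python
import random

NUCLEOTIDES = "ACGU"
MASK = "_"  # hypothetical mask token, chosen only for this sketch


def toy_denoiser(seq, rng):
    """Stand-in for a trained model (assumption): for every masked position,
    propose a nucleotide and a confidence score, both uniformly random."""
    return {i: (rng.choice(NUCLEOTIDES), rng.random())
            for i, tok in enumerate(seq) if tok == MASK}


def generate(length, steps, seed=0, fixed=None):
    """Start from an all-mask sequence and unmask it over `steps` rounds.

    `fixed` maps positions to nucleotides that are never masked, e.g. a
    binding site that should survive redesign of the remaining positions.
    """
    rng = random.Random(seed)
    seq = [MASK] * length
    for pos, nt in (fixed or {}).items():
        seq[pos] = nt
    for step in range(steps):
        proposals = toy_denoiser(seq, rng)
        if not proposals:
            break
        # Unmask an even share of the remaining masked positions each round,
        # most confident proposals first; the last round unmasks everything.
        remaining_steps = steps - step
        k = -(-len(proposals) // remaining_steps)  # ceiling division
        ranked = sorted(proposals.items(),
                        key=lambda kv: kv[1][1], reverse=True)
        for pos, (nt, _) in ranked[:k]:
            seq[pos] = nt
    return "".join(seq)


if __name__ == "__main__":
    print(generate(length=24, steps=6))
    print(generate(length=24, steps=6, fixed={0: "G", 1: "G", 2: "A"}))
```

In a real masked diffusion model the proposals and confidences would come from a trained bidirectional network conditioned on the partially unmasked sequence, which is what lets later rounds depend on earlier choices.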