Poster in Workshop: Machine Learning for Genomics Explorations (MLGenX)
Deep Learning and Direct Sequencing of Labeled RNA Captures Transcriptome Dynamics
Vlastimil Martinek · Jessica Martin · Cedric Belair · Matthew Payea · Sulochan Malla · Panagiotis Alexiou · Manolis Maragkakis
In eukaryotes, genes produce a variety of distinct RNA isoforms, each with unique regulatory signals and resulting protein products. Assessing the metabolism of RNA isoforms is essential for unraveling gene regulatory mechanisms. However, this is impeded by current methods that rely on short-read sequencing and therefore cannot reliably differentiate between individual isoforms. Additionally, these methods cannot concurrently analyze RNA isoform metabolism and key regulatory elements of RNA stability, such as the poly(A) tail and nucleotide modifications. Here, we metabolically label nascent RNA with the 5-ethynyl uridine modification and employ direct RNA nanopore sequencing. We introduce RNAkinet, a deep convolutional and recurrent neural network tailored to directly process electrical signals from nanopore sequencing for the detection of modified RNA molecules. RNAkinet effectively distinguishes between nascent and pre-existing RNA molecules and generalizes to various cell types and organisms. By modeling RNA decay rates, RNAkinet allows reproducible identification of the kinetic parameters of individual RNA isoforms and facilitates efficient, integrated studies of RNA isoform metabolism and the regulatory elements that influence it.
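To make the decay-modeling step concrete, the following is a minimal sketch of how per-read nascent/pre-existing calls could be turned into isoform half-lives. It is not taken from RNAkinet itself: it assumes steady-state expression and first-order decay, under which the labeled fraction f after a labeling time t satisfies f = 1 - exp(-k t). The function names, the input layout (a dictionary mapping isoform IDs to lists of boolean calls), and the isoform IDs are all hypothetical.

```python
import math


def decay_rate_from_labeled_fraction(labeled_fraction, labeling_hours):
    """Estimate a first-order decay rate k (per hour).

    Assumption (not from the abstract): at steady state the fraction of
    labeled (nascent) molecules after labeling time t is 1 - exp(-k * t),
    so k = -ln(1 - f) / t.
    """
    if not 0.0 <= labeled_fraction < 1.0:
        raise ValueError("labeled_fraction must be in [0, 1)")
    if labeling_hours <= 0:
        raise ValueError("labeling_hours must be positive")
    return -math.log(1.0 - labeled_fraction) / labeling_hours


def half_life_hours(decay_rate):
    """Convert a decay rate to a half-life; a zero rate means no observed decay."""
    if decay_rate <= 0:
        return math.inf
    return math.log(2) / decay_rate


def isoform_half_lives(read_calls, labeling_hours):
    """Compute a half-life per isoform from per-read nascent calls.

    read_calls is a hypothetical structure: isoform ID -> list of booleans,
    where True means the read was classified as nascent (labeled).
    Isoforms with no reads, or with every read labeled, are reported as None
    because the model cannot estimate a finite rate for them.
    """
    results = {}
    for isoform, calls in read_calls.items():
        if not calls:
            results[isoform] = None
            continue
        fraction = sum(calls) / len(calls)
        if fraction >= 1.0:
            results[isoform] = None
            continue
        rate = decay_rate_from_labeled_fraction(fraction, labeling_hours)
        results[isoform] = half_life_hours(rate)
    return results


if __name__ == "__main__":
    # Toy calls for two made-up isoforms after a 2-hour labeling window.
    calls = {
        "iso_a": [True, False],
        "iso_b": [False, False, False, True],
    }
    for isoform, t_half in isoform_half_lives(calls, 2.0).items():
        print(isoform, t_half)
```

In practice the labeled fraction would need correcting for classifier false positives and false negatives, and for read depth per isoform; this sketch omits both.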