Poster in Workshop: Machine Learning for Genomics Explorations (MLGenX)
A mechanistically interpretable neural-network architecture for discovery of regulatory genomics
Alex M Tseng · Gökcen Eraslan · Nathaniel Diamant · Tommaso Biancalani · Gabriele Scalia
Deep neural networks have shown unparalleled success in mapping genomic DNA sequences to associated readouts such as protein–DNA binding. Beyond prediction, a central goal of these networks is to learn the underlying motifs (and their syntax) that drive genome regulation. Traditionally, this has been done by applying fragile and computationally expensive post-hoc analysis pipelines to trained models. Instead, we propose an alternative approach to learning motif biology from neural networks. We designed a mechanistically interpretable neural-network architecture for regulatory genomics, in which motifs and their syntax are directly encoded in, and readable from, the learned weights and activations, eliminating the need for post-hoc pipelines. Our model is also more robust to variable sequence contexts and to adversarial attacks, while attaining predictive performance comparable to its traditional counterparts.
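To illustrate what "readable from the learned weights" can mean in the simplest case, the sketch below scans a one-hot-encoded DNA sequence with a single motif-like filter and reads a consensus motif directly off the filter weights. It is a generic illustration only, not the architecture described in this abstract: the filter values, the example sequence, and the helper names (`one_hot`, `scan`, `consensus`) are all hypothetical.

```python
import numpy as np

ALPHABET = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (length, 4) one-hot array."""
    idx = np.array([ALPHABET.index(base) for base in seq])
    return np.eye(4)[idx]

def scan(seq, weights):
    """Slide a (width, 4) filter along the sequence; one score per window."""
    x = one_hot(seq)
    width = weights.shape[0]
    n_windows = len(seq) - width + 1
    return np.array([np.sum(x[i:i + width] * weights) for i in range(n_windows)])

def consensus(weights):
    """Read the highest-weight base at each filter position."""
    return "".join(ALPHABET[j] for j in weights.argmax(axis=1))

# Hypothetical filter weights (rows: motif positions; columns: A, C, G, T).
# In a trained model these would be learned; here they are set by hand.
motif_filter = np.array([
    [-1.0, -1.0,  2.0, -1.0],
    [ 2.0, -1.0, -1.0, -1.0],
    [-1.0, -1.0, -1.0,  2.0],
    [ 2.0, -1.0, -1.0, -1.0],
])

sequence = "CCGATACC"  # hypothetical example sequence
scores = scan(sequence, motif_filter)
best_start = int(scores.argmax())

print("motif read from weights:", consensus(motif_filter))
print("best-scoring window:", sequence[best_start:best_start + motif_filter.shape[0]])
```

Here the motif is recovered by inspecting the weights alone, and the activations (window scores) indicate where it occurs in the sequence; no attribution or clustering step is involved. How the proposed architecture encodes motif syntax is not specified in this abstract and is not modeled here.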