Beyond Motif Localization: Probing Rule-Level Signals in Synthetic Genomic Grammars
Abstract
Attribution methods are standard tools for interpreting deep learning models in regulatory genomics, but evaluations typically focus on whether motif bases receive high importance scores. We ask whether attribution maps also capture compositional rules such as motif ordering, spacing, and logical interactions. Using synthetic DNA datasets with known ground-truth grammars, we evaluate five attribution methods on both localization accuracy and rule-level consistency. To measure rule-level consistency, we introduce the Grammar Satisfiability Score (GSS), a metric that checks whether signed attributions satisfy the Boolean logic of the generating grammar. We find that strong motif localization coexists with poor logical faithfulness for conjunctive and context-dependent grammars, and that the structure of saliency maps persists under progressive randomization of model parameters.