ML-GUIDED MINING OF AN EXTENSIVELY VALIDATED SCFV LIBRARY FOR OPEN-SOURCE ENZYMES IN DIAGNOSTICS
Abstract
Cost and intellectual-property (IP) barriers limit access to high-performance enzyme start/stop modifiers, such as hot-start systems that suppress premature activity during reaction setup. We combine an accessible 10^10-member human single-chain variable fragment (scFv) phage-display library with activity-linked screening to engineer open, recombinant enzyme regulators. Using low-cost fluorescence workflows (in-house dye synthesis gives a 67-86× cost reduction), we identify scFv inhibitors that convert standard polymerases into hot-start formulations. In head-to-head benchmarking against commercial hot-start enzymes, scFv-regulated polymerases achieve comparable suppression during setup, with heat-triggered recovery of activity and robust amplification. To scale beyond individual hits, we outline a data-centric pipeline: selections tracked by next-generation sequencing (NGS) that yield more than 10^4 binder/non-binder sequences per target, and deep mutational scanning of lead scFvs (5,000-20,000 variants) to map inhibitory fitness landscapes at the level of the complementarity-determining regions (CDRs) for predictive design. We highlight prospective extensions to ligases, restriction enzymes, and CRISPR-Cas systems.