Quantifying Automation Risk in Financial AI: A Probabilistic Decomposition of Failure, Harm, and Severity
Abstract
Financial institutions increasingly deploy AI systems with varying levels of automation, yet they lack principled methods for quantifying how automation affects harm propagation when failures occur. We propose a parsimonious risk decomposition that expresses expected loss as E[Loss] = P(F) × P(H | F, A) × E[S | H], where F denotes system failure, H denotes harm, S denotes severity, and A ∈ [0,1] represents the automation level. This framework isolates a critical but underexamined quantity: P(H | F, A), the conditional probability that a failure propagates into harm given the level of automation, which captures execution and oversight risk rather than model accuracy alone. We derive key theoretical properties, including automation risk elasticity, and draw out implications for allocating resources between model validation and deployment controls. We conduct an exploratory empirical analysis of 203 finance-specific AI incidents (1998–2025) from the AI Incident Database, using transparent, reproducible keyword-based coding. High-automation systems exhibit a 5.76× higher conditional probability of harm than human-supervised systems (Fisher’s exact test, p = 0.015; bootstrap 95% CI: [1.53, 8.37]). Extensive sensitivity analyses, including alternative automation definitions, domain exclusions, and partial identification bounds, indicate that this finding is robust while quantifying the remaining uncertainty. A calibrated case study of the 2012 Knight Capital incident demonstrates practical application, showing that the ROI for oversight investments ranges from 0.8× to 11.7× across plausible parameter ranges. We position this work as a foundational quantification that provides actionable guidance for allocating resources between model validation and deployment controls in agentic financial AI systems.
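As a minimal sketch of how the decomposition E[Loss] = P(F) × P(H | F, A) × E[S | H] might be evaluated, the Python snippet below multiplies the three factors for two automation regimes. All numeric inputs are hypothetical placeholders chosen only for illustration; they are not estimates from the incident data, and the function name `expected_loss` is an assumption introduced here. The abstract does not specify a functional form for P(H | F, A), so the sketch takes that probability as a direct input rather than modelling it.

```python
def expected_loss(p_failure, p_harm_given_failure, expected_severity):
    """Return E[Loss] = P(F) * P(H | F, A) * E[S | H].

    p_harm_given_failure stands for P(H | F, A) already evaluated at a
    chosen automation level A; no functional form in A is assumed here.
    """
    for name, p in (("p_failure", p_failure),
                    ("p_harm_given_failure", p_harm_given_failure)):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {p}")
    if expected_severity < 0:
        raise ValueError("expected_severity must be non-negative")
    return p_failure * p_harm_given_failure * expected_severity


# Hypothetical inputs (illustrative only, not taken from the paper).
P_FAILURE = 0.02                 # P(F): probability the system fails
P_HARM_SUPERVISED = 0.05         # P(H | F, A) under human supervision
P_HARM_AUTOMATED = 0.25          # P(H | F, A) under high automation
EXPECTED_SEVERITY = 1_000_000.0  # E[S | H]: expected loss given harm

loss_supervised = expected_loss(P_FAILURE, P_HARM_SUPERVISED, EXPECTED_SEVERITY)
loss_automated = expected_loss(P_FAILURE, P_HARM_AUTOMATED, EXPECTED_SEVERITY)

# With P(F) and E[S | H] held fixed, the ratio of expected losses equals
# the ratio of the conditional harm probabilities.
loss_ratio = loss_automated / loss_supervised

print(f"Expected loss (supervised): {loss_supervised:,.2f}")
print(f"Expected loss (automated):  {loss_automated:,.2f}")
print(f"Ratio: {loss_ratio:.2f}x")
```

Because the decomposition is multiplicative, holding P(F) and E[S | H] fixed means any relative change in P(H | F, A) carries through one-for-one to expected loss. Under that assumption, deployment controls that lower P(H | F, A) can be compared directly with validation work that lowers P(F).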