Poster in Workshop: Agents in the Wild: Safety, Security, and Beyond

Position: We Must Proactively Address AI Safety Debt

Peter Wallich ⋅ Raymond Douglas


Abstract

This is a position paper. We argue that AI safety debt — the cost of closing the accumulated gaps between an AI system's actual safety approach and the approach it needs — is accumulating rapidly in frontier AI systems. In the race to unlock near-term capabilities, practitioners often implement safety interventions that do not scale to more advanced, less transparent models. The concept extends the established software-engineering notion of technical debt, but four structural properties make AI safety debt harder to manage: capabilities and contexts shift unpredictably, closing gaps may require solving open scientific problems, harms largely fall on third parties, and adversaries and AI systems may actively exploit gaps. Our position is that the AI community must explicitly track and manage this debt rather than continually deferring it. We propose the AI safety debt register, a practical approach using structured "debt cards" that connect safety claims, supporting evidence, and organisational decisions. We argue that this framework complements existing governance approaches by providing bottom-up aggregation of safety gaps, proactive assessment of how evidence degrades over time, and an improved treatment of uncertainty.
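As an illustration only, the register described above could be sketched as a small data structure. The abstract does not specify a schema, so every class name, field name, and the review-date mechanism below is a hypothetical assumption, not the authors' format. The sketch shows one way a "debt card" might link a safety claim to its evidence and to an organisational decision, and how a register might flag cards whose evidence is due for reassessment.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class DebtCard:
    """One hypothetical register entry. All field names are illustrative assumptions."""
    safety_claim: str            # the claim the safety approach relies on
    evidence: List[str]          # evidence currently supporting the claim
    decision: str                # organisational decision that depends on the claim
    evidence_review_by: date     # assumed date after which the evidence should be re-checked
    closed: bool = False         # True once the gap has been addressed


@dataclass
class DebtRegister:
    """A hypothetical bottom-up collection of debt cards."""
    cards: List[DebtCard] = field(default_factory=list)

    def add(self, card: DebtCard) -> None:
        self.cards.append(card)

    def open_cards(self) -> List[DebtCard]:
        # Cards whose gap has not yet been closed.
        return [c for c in self.cards if not c.closed]

    def stale_cards(self, today: date) -> List[DebtCard]:
        # Open cards whose supporting evidence is past its review date.
        return [c for c in self.open_cards() if c.evidence_review_by < today]


# Placeholder entries; the text and dates are invented purely for illustration.
register = DebtRegister()
register.add(DebtCard(
    safety_claim="Placeholder claim A",
    evidence=["Placeholder evaluation result"],
    decision="Placeholder deployment decision",
    evidence_review_by=date(2025, 1, 1),
))
register.add(DebtCard(
    safety_claim="Placeholder claim B",
    evidence=["Placeholder audit note"],
    decision="Placeholder access decision",
    evidence_review_by=date(2030, 1, 1),
))

stale = register.stale_cards(today=date(2026, 1, 1))
print([c.safety_claim for c in stale])
```

A fixed review date is the simplest stand-in for "how evidence degrades over time". A fuller treatment might instead tie reassessment to capability or deployment-context changes, which this sketch does not model.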
