D&R: Recovery-based AI-Generated Text Detection via a Single Black-box LLM Call
Abstract
Large language models (LLMs) generate increasingly human-like text, raising concerns about misinformation and authenticity. Detecting AI-generated text remains challenging: existing methods often underperform, especially on short texts; require probability access that is unavailable in real-world black-box settings; incur high costs from multiple LLM calls; or fail to generalize across source models. We propose Disrupt-and-Recover (D&R), a recovery-based detection framework grounded in posterior concentration. D&R disrupts text via model-free Within-Chunk Shuffling, performs a single black-box LLM recovery, and measures semantic–structural recovery similarity as a proxy for concentration. This design ensures efficiency and black-box practicality, and is theoretically supported under the concentration assumption. Extensive experiments across four datasets and six source models show that D&R achieves state-of-the-art performance, with an AUROC of 0.96 on long texts and 0.87 on short texts, surpassing the strongest baseline by +0.08 and +0.14, respectively. D&R further remains robust under source–recovery mismatch and model variation. Our code and data are available at https://anonymous.4open.science/r/1MAdaWTy0xaod5qR.
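To make the pipeline described above concrete, the following is a minimal sketch of the disrupt-recover-score loop. The chunk size, the `recover` callable (standing in for any single black-box LLM call), and the use of a difflib ratio as a crude stand-in for the paper's semantic–structural recovery similarity are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch of the D&R pipeline (assumptions noted above).
import difflib
import random
from typing import Callable


def within_chunk_shuffle(text: str, chunk_size: int = 5, seed: int = 0) -> str:
    """Model-free disruption: shuffle the words inside each fixed-size chunk."""
    rng = random.Random(seed)
    words = text.split()
    chunks = [words[i:i + chunk_size] for i in range(0, len(words), chunk_size)]
    for chunk in chunks:
        rng.shuffle(chunk)
    return " ".join(word for chunk in chunks for word in chunk)


def dr_score(text: str, recover: Callable[[str], str]) -> float:
    """Disrupt, recover with one black-box LLM call, and score recovery similarity.

    Higher recovery similarity is taken as evidence of stronger posterior
    concentration, i.e. that the text is more likely AI-generated.
    """
    disrupted = within_chunk_shuffle(text)
    recovered = recover(disrupted)  # the single black-box LLM call
    # Stand-in similarity; the paper combines semantic and structural signals.
    return difflib.SequenceMatcher(None, text, recovered).ratio()
```

A detector would then threshold `dr_score` (or feed it to a lightweight classifier); the score requires only one recovery call per input and no access to the source model's probabilities.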