Poster
Deep Kernel Relative Test for Machine-generated Text Detection
Yiliao Song · Zhenqiao Yuan · Shuhai Zhang · Zhen Fang · Jun Yu · Feng Liu
Hall 3 + Hall 2B #299
Recent studies demonstrate that two-sample tests can effectively detect machine-generated texts (MGTs) and adapt well to texts generated by newer LLMs. However, two-sample test-based detection relies on the assumption that human-written texts (HWTs) must follow the distribution of seen HWTs. As a result, it tends to misidentify HWTs that deviate from the seen HWT distribution, limiting its use in sensitive areas like academic integrity verification. To address this issue, we propose to employ a non-parametric kernel relative test to detect MGTs by testing whether the distribution of a text to be tested is statistically significantly closer to the distribution of HWTs than to the distribution of MGTs. We further develop a kernel optimisation algorithm for the relative test to select the best kernel that can enhance the testing capability for MGT detection. As the relative test does not assume that a text to be tested must belong exclusively to either MGTs or HWTs, it can largely reduce the false positive error compared to the two-sample test, offering significant advantages in practice. Extensive experiments demonstrate the superior performance of our method compared to state-of-the-art non-parametric and parametric detectors. The code and demo are available at https://github.com/xLearn-AU/R-Detect.
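The core idea of a relative test can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes texts have already been mapped to fixed-length feature vectors, uses a fixed-bandwidth Gaussian kernel in place of the optimised deep kernel, and computes only the relative statistic (the difference of two unbiased squared-MMD estimates) without the significance threshold that a full test would apply. The function names, the bandwidth value, and the synthetic Gaussian data are all hypothetical.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth):
    # Pairwise squared Euclidean distances between rows of a and rows of b.
    sq = (a**2).sum(axis=1)[:, None] + (b**2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def mmd2_unbiased(x, y, bandwidth):
    # Unbiased estimate of squared MMD between samples x and y.
    m, n = len(x), len(y)
    kxx = gaussian_kernel(x, x, bandwidth)
    kyy = gaussian_kernel(y, y, bandwidth)
    kxy = gaussian_kernel(x, y, bandwidth)
    # Exclude diagonal (self-similarity) terms from the within-sample averages.
    term_xx = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))
    term_yy = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
    return term_xx + term_yy - 2.0 * kxy.mean()

def relative_statistic(x, hwt, mgt, bandwidth=2.0):
    # Negative: x is closer to the HWT reference sample than to the MGT one.
    # Positive: x is closer to the MGT reference sample.
    return mmd2_unbiased(x, hwt, bandwidth) - mmd2_unbiased(x, mgt, bandwidth)

# Synthetic stand-ins for text features (assumption: 5-dimensional Gaussians).
rng = np.random.default_rng(0)
hwt_ref = rng.normal(0.0, 1.0, size=(200, 5))   # "seen" human-written texts
mgt_ref = rng.normal(2.0, 1.0, size=(200, 5))   # machine-generated texts
shifted_human = rng.normal(0.3, 1.0, size=(100, 5))  # deviates slightly from hwt_ref
machine_like = rng.normal(1.8, 1.0, size=(100, 5))

stat_human = relative_statistic(shifted_human, hwt_ref, mgt_ref)
stat_machine = relative_statistic(machine_like, hwt_ref, mgt_ref)
print(stat_human, stat_machine)
```

In this toy setting, the shifted human-like sample does not match the seen HWT distribution exactly, yet its statistic is still negative because it remains closer to the HWT reference than to the MGT reference. That is the intuition behind why a relative comparison can avoid the false positives a two-sample test against HWTs alone would produce.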