Attacking the Madry Defense Model with $L_1$-based Adversarial Examples

Workshop


Yash Sharma · Pin-Yu Chen

East Meeting Level 8 + 15 #15

Tue 1 May, 4:30 p.m. PDT

[ Abstract ]


The Madry Lab recently hosted a competition designed to test the robustness of their adversarially trained MNIST model. Attacks were constrained to perturb each pixel of the input image by a scaled maximal $L_\infty$ distortion of $\epsilon = 0.3$. This decision discourages the use of attacks which are not optimized on the $L_\infty$ distortion metric. Our experimental results demonstrate that by relaxing the $L_\infty$ constraint of the competition, the elastic-net attack to deep neural networks (EAD) can generate transferable adversarial examples which, despite their high average $L_\infty$ distortion, have minimal visual distortion. These results call into question the use of $L_\infty$ as a sole measure for visual distortion, and further demonstrate the power of EAD at generating robust adversarial examples.
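To make the contrast between the two distortion metrics concrete, the following is a minimal sketch in plain Python. It is not taken from the paper: the perturbation values, the helper names (`linf`, `l1`, `within_linf_budget`), and the choice of 20 modified pixels are all illustrative assumptions. It only shows that a perturbation concentrated on a few pixels can exceed an $L_\infty$ budget of $\epsilon = 0.3$ while having a much smaller $L_1$ norm than a dense perturbation that stays within the budget.

```python
# Illustrative sketch only: all values below are made up, not results from the paper.

def linf(delta):
    """L-infinity distortion: the largest absolute per-pixel change."""
    return max(abs(d) for d in delta)

def l1(delta):
    """L1 distortion: the sum of absolute per-pixel changes."""
    return sum(abs(d) for d in delta)

def within_linf_budget(delta, eps=0.3):
    """True if every pixel moves by at most eps (the competition constraint)."""
    return linf(delta) <= eps

N_PIXELS = 28 * 28  # MNIST image size, with pixel values scaled to [0, 1]

# Hypothetical dense perturbation: every pixel shifted by 0.25.
dense = [0.25] * N_PIXELS

# Hypothetical sparse perturbation: 20 pixels shifted by 0.9, the rest untouched.
sparse = [0.9] * 20 + [0.0] * (N_PIXELS - 20)

for name, delta in [("dense", dense), ("sparse", sparse)]:
    print(name, "Linf:", linf(delta), "L1:", l1(delta),
          "within budget:", within_linf_budget(delta))
```

Under these assumed values, the dense perturbation satisfies the $L_\infty$ constraint and the sparse one does not, even though the sparse perturbation changes far fewer pixels and has the smaller $L_1$ norm. Neither norm is claimed here to track visual distortion; the sketch only shows that the two metrics can rank the same pair of perturbations in opposite order.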
