Poster
CLIPure: Purification in Latent Space via CLIP for Adversarially Robust Zero-Shot Classification
Mingkun Zhang · Keping Bi · Wei Chen · Jiafeng Guo · Xueqi Cheng
Hall 3 + Hall 2B #576
In this paper, we aim to build an adversarially robust zero-shot image classifier that can accurately and efficiently classify unseen examples while defending against unforeseen adversarial attacks, addressing critical challenges in real-world safety-sensitive scenarios. To achieve this, we focus on two key challenges: zero-shot classification and defense against unforeseen attacks. We ground our work on CLIP, a vision-language pre-trained model, to perform zero-shot classification. To defend against unforeseen attacks, we adopt a purification approach, as it is independent of specific attack types. We then define purification risk as the KL divergence between the joint distributions of the purification and attack processes. The derived lower bound of purification risk inspires us to explore purification in CLIP's multi-modal latent space. We propose a CLIP-based purification method called CLIPure, which has two variants: CLIPure-Diff, which models image likelihood with a generative process over its latent vector, and CLIPure-Cos, which models the likelihood based on the similarity between the embeddings of the image and a blank template. To the best of our knowledge, CLIPure is the first purification method operating in latent space, and CLIPure-Cos is the first purification method that does not rely on generative models, substantially improving defense efficiency. Extensive experimental results show that the robustness achieved by CLIPure is within a small gap of clean accuracy, outperforming SOTA robustness by a large margin, e.g., from 71.7% to 91.1% on CIFAR10, from 59.6% to 72.6% on ImageNet, and a 108% relative improvement in average robustness across 13 datasets over the previous SOTA, with only 14% extra inference cost and no additional training.
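The abstract describes CLIPure-Cos only at a high level; the sketch below illustrates our reading of the idea under stated assumptions: embed the (possibly adversarial) image with CLIP, purify the latent vector by gradient ascent on its cosine similarity to the embedding of a blank text template, then classify the purified embedding zero-shot. The checkpoint, the blank prompt "a photo of a .", and the step size and iteration count are illustrative assumptions, not the authors' exact settings.

```python
# Minimal sketch of CLIPure-Cos-style latent purification, based only on the
# abstract's description. Hyperparameters and the blank template are assumed.
import torch
import torch.nn.functional as F
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

@torch.no_grad()
def embed_text(prompts):
    # Normalized CLIP text embeddings for a list of prompts.
    inputs = processor(text=prompts, return_tensors="pt", padding=True)
    return F.normalize(model.get_text_features(**inputs), dim=-1)

def purify_and_classify(image, class_names, steps=10, lr=0.1):
    # Embed the (possibly adversarial) image into CLIP's latent space.
    pixels = processor(images=image, return_tensors="pt")["pixel_values"]
    with torch.no_grad():
        z = F.normalize(model.get_image_features(pixel_values=pixels), dim=-1)

    # Embedding of a "blank" template carrying no class information
    # (the exact wording is an assumption).
    blank = embed_text(["a photo of a ."])

    # Purify the latent vector by ascending its cosine similarity to the
    # blank-template embedding, which CLIPure-Cos uses as a likelihood proxy.
    z = z.clone().requires_grad_(True)
    opt = torch.optim.SGD([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = -F.cosine_similarity(z, blank).mean()
        loss.backward()
        opt.step()

    # Zero-shot classification with the purified embedding.
    class_emb = embed_text([f"a photo of a {c}" for c in class_names])
    logits = F.normalize(z.detach(), dim=-1) @ class_emb.T
    return class_names[logits.argmax(dim=-1).item()]

# Example usage: purify_and_classify(pil_image, ["cat", "dog", "airplane"])
```

Because purification happens entirely on the embedding rather than in pixel space, no generative model is needed at inference time, which is consistent with the efficiency gains reported above.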