Code for the paper "Generalizable Data-free Objective for Crafting Universal Adversarial Perturbations" by Mopuri et al., 2018.
This repository depends on PyTorch; if you prefer TensorFlow, refer to the original repository. This implementation was developed by Pedro Sandoval-Segura.
A universal adversarial perturbation (UAP) is an image-agnostic perturbation vector that, when added to nearly any image, causes a classifier to change its prediction for that image. In the example below, VGG-16 classifies the clean image on the left as a 'piggy bank', but when a UAP (under an imperceptibility constraint) is added to the image, VGG-16 misclassifies it as a 'theater curtain'.
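The imperceptibility constraint is typically an ℓ∞ bound on the perturbation. A minimal NumPy sketch of applying a UAP under such a bound (the epsilon value of 10/255 is an illustrative assumption, not a value taken from this repository):

```python
import numpy as np

def apply_uap(image, uap, eps=10 / 255):
    """Clip a universal perturbation to an L-infinity ball of radius eps,
    add it to an image in [0, 1], and clip back to the valid pixel range."""
    uap = np.clip(uap, -eps, eps)          # enforce imperceptibility
    return np.clip(image + uap, 0.0, 1.0)  # keep valid pixel values

# The same clipped perturbation is reused across many images:
image = np.random.rand(3, 224, 224)
adv = apply_uap(image, np.random.randn(3, 224, 224))
assert adv.shape == image.shape
```

Because the perturbation is image-agnostic, it is optimized once and then simply added to every input at test time.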
The main algorithm for optimizing a UAP is in `gduap.py`. The range prior and the data prior described in the original paper are not implemented here.
- Install dependencies listed in `requirements.txt`. Note that not all of the dependencies are required; the main modules are torch 1.6.0 and torchvision 0.7.0.
- In `gduap.py`, set the variables `TORCH_HUB_DIR`, `IMAGENET_VAL_DIR`, and `VOC_VAL_DIR` at the top of the file.
  - `TORCH_HUB_DIR` should be the directory where you'd like PyTorch pretrained model parameters to be saved. More info: torch.hub documentation
  - `IMAGENET_VAL_DIR` should be the directory containing `ILSVRC2012_devkit_t12.tar.gz`. More info: ImageNet
  - `VOC_VAL_DIR` should be the directory containing `VOCdevkit`. More info: Pascal VOC2012
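For example, the top of `gduap.py` would then look something like this (the paths below are placeholders, not real locations):

```python
# Placeholder paths -- point these at your own directories.
TORCH_HUB_DIR = "/data/torch_hub"    # where torch.hub caches pretrained weights
IMAGENET_VAL_DIR = "/data/imagenet"  # contains ILSVRC2012_devkit_t12.tar.gz
VOC_VAL_DIR = "/data/voc2012"        # contains VOCdevkit/
```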
Note: you can skip to Section 3 if you don't intend to train new UAPs. Example perturbation vectors are provided in the `perturbations/` folder for evaluation.
To optimize a UAP for a VGG-16:
```
python3 train.py --model vgg16 --id 12345
```
By default, this will use the Pascal VOC 2012 validation set as the "substitute dataset" described in the paper. The final evaluation is performed on the ILSVRC 2012 validation set of 50k images. The `--id` option can be used to uniquely prefix the UAP files that are saved to the `perturbations/` folder. Run `python3 train.py --help` to see all the options.
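At its core, the paper's data-free objective maximizes the activations that the perturbation alone produces at every layer, i.e. it minimizes −Σᵢ log ‖fᵢ(δ)‖₂ over the perturbation δ. Below is a hedged sketch of that optimization on a toy two-layer network; the real loop lives in `gduap.py` and differs in details (model, layers hooked, optimizer, epsilon):

```python
import torch
import torch.nn as nn

# Toy stand-in for a pretrained classifier (frozen during UAP training).
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 8, 3, padding=1), nn.ReLU(),
)
for p in model.parameters():
    p.requires_grad_(False)

# Collect intermediate activations with forward hooks.
activations = []
for layer in model:
    if isinstance(layer, nn.ReLU):
        layer.register_forward_hook(lambda m, inp, out: activations.append(out))

eps = 10 / 255
# The UAP being optimized; small random init so activations are nonzero.
delta = torch.empty(1, 3, 32, 32).uniform_(-eps, eps).requires_grad_(True)
opt = torch.optim.Adam([delta], lr=0.1)

for _ in range(5):
    activations.clear()
    model(delta)  # forward the perturbation alone: no training data needed
    # Data-free objective: maximize activation magnitudes at every layer.
    loss = -sum(torch.log(a.norm() + 1e-8) for a in activations)
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():
        delta.clamp_(-eps, eps)  # re-impose the imperceptibility constraint
```

The key property is that no images appear anywhere in the loop, which is what makes the objective "data-free".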
After a UAP is optimized using the `train.py` script, evaluation is automatically performed on the ILSVRC 2012 validation set. You can also refer to `Sample (Evaluation).ipynb` to understand how the fooling rate is evaluated.
`Plot.ipynb` plots perturbation vectors that have been previously optimized on VGG-16, VGG-19, GoogLeNet, ResNet-50, and ResNet-152. `Sample (Evaluation).ipynb` walks through the steps needed to evaluate a UAP's fooling rate: load a model to perform the classification, load a dataset on which to evaluate, and load a UAP to perturb the images.
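The fooling rate itself is simple to compute: it is the fraction of images whose predicted label changes once the UAP is added. A minimal sketch with hypothetical class predictions (the notebook does this with real model logits):

```python
import numpy as np

def fooling_rate(clean_preds, adv_preds):
    """Fraction of inputs whose predicted class changes under the UAP."""
    clean_preds = np.asarray(clean_preds)
    adv_preds = np.asarray(adv_preds)
    return float(np.mean(clean_preds != adv_preds))

# Hypothetical predicted classes for 5 images, before and after perturbation:
clean = [207, 883, 12, 34, 5]
adv   = [854, 883, 99, 11, 5]    # 3 of 5 predictions changed
print(fooling_rate(clean, adv))  # → 0.6
```

Note that the fooling rate only measures whether the prediction changed, not whether the clean prediction was correct in the first place.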