Initial code commit
Co-authored-by: Sara Fridovich-Keil <[email protected]>
Co-authored-by: Brian Bartoldson <[email protected]>
Co-authored-by: James Diffenderfer <[email protected]>
Co-authored-by: Bhavya Kailkhura <[email protected]>
5 people committed Nov 28, 2022
1 parent 165420c commit adcf20a
Showing 8 changed files with 1,546 additions and 1 deletion.
28 changes: 27 additions & 1 deletion README.md
@@ -1,10 +1,36 @@
# RobustNets
-RobustNets benchmark models and code (coming soon!)
+RobustNets benchmark models and code.

Code and model release for:

[**Models Out of Line: A Fourier Lens on Distribution Shift Robustness**](https://arxiv.org/abs/2207.04075)

## Getting started
We recommend cloning this repository and running the data download script `download_RobustNets.sh`. Alternatively, you can manually download the RobustNets model state dicts here: https://github.com/sarafridov/RobustNets/releases.

After downloading the RobustNets dataset and code, run `RobustNets.py` to verify that the whole dataset is present and that everything is working correctly; if so, it finishes by printing `All checks passed!`. The script expects access to a GPU and takes about 10 minutes to run because it computes CIFAR-10 test accuracy for each model. You can skip the accuracy checks if they are too burdensome, but they help confirm that we are working with the same data. Assuming you downloaded the RobustNets assets to the directory `RobustNets` and want to use the directory `tempC` to store `torchvision`'s CIFAR-10 data, enter the following command:

`python RobustNets.py --PATH_TO_RobustNets=RobustNets --PATH_TO_c10=tempC`

`RobustNets.py` contains the function `iterate_over_RobustNets`, which shows how to iterate over the unique identifier (`model_string`) of every model in the RobustNets dataset. The script also contains `check_RobustNets_c10_accuracy`, which illustrates how to load these models given their identifiers. In particular, you must use the function `instantiate_model`, which takes the model identifier and the path to RobustNets as arguments: `model = instantiate_model(model_string, PATH_TO_RobustNets)`.
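
For concreteness, here is a minimal sketch of how these pieces fit together. It assumes the repository root is your working directory and the RobustNets assets are in `RobustNets/`; the helper `report_model` and the parameter-count printout are purely illustrative.

```
from pathlib import Path
from models import instantiate_model
from RobustNets import iterate_over_RobustNets

PATH_TO_RobustNets = Path('RobustNets')  # directory holding the downloaded state dicts

def report_model(model_string):
    # illustrative helper: model_string uniquely identifies one RobustNets model
    model = instantiate_model(model_string, PATH_TO_RobustNets)
    print(model_string, sum(p.numel() for p in model.parameters()), 'parameters')

# apply the callback to every model identifier in the dataset
iterate_over_RobustNets(report_model)
```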

## Computing metrics

All metrics applied to the RobustNets models in our paper are stored in the dictionary `RobustNets/metric_and_OOD_var_dict.json`, but you may want to recompute these metrics or compute them on other models; the following commands illustrate how to do this, and the sketch after them shows how to read the precomputed values. To use a model other than the default, specify it via the args (see `get_args` in `utilities.py`). To use a model outside of the RobustNets dataset, modify the metric-computation programs to load your desired model rather than a RobustNets model. Finally, note that the interpolation programs create and save additional data at the specified `PATH_TO_interp`.

Compute Fourier interpolation metrics:

`python fourier_interpolation.py --PATH_TO_RobustNets=RobustNets --PATH_TO_interp=tempI --PATH_TO_c10=tempC`

Compute pixel interpolation metrics:

`python pixel_interpolation.py --PATH_TO_RobustNets=RobustNets --PATH_TO_interp=tempI --PATH_TO_c10=tempC`

Compute Jacobian norm:

`python jacobian_norm.py --PATH_TO_RobustNets=RobustNets --PATH_TO_c10=tempC`
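
If you only want to read the precomputed values rather than recompute them, a minimal sketch follows. The only key name confirmed by `RobustNets.py` is `cifar10_acc` (clean CIFAR-10 test accuracy); the loop and printout are illustrative.

```
import json

# precomputed metrics for every RobustNets model, shipped at RobustNets/metric_and_OOD_var_dict.json
with open('RobustNets/metric_and_OOD_var_dict.json', 'r') as f:
    metric_and_OOD_var_dict = json.load(f)

# keys are model identifiers; 'cifar10_acc' is the clean CIFAR-10 test accuracy used in RobustNets.py
for model_string, metrics in metric_and_OOD_var_dict.items():
    print(model_string, metrics['cifar10_acc'])
```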


## Citation
```
@inproceedings{modelsoutofline,
140 changes: 140 additions & 0 deletions RobustNets.py
@@ -0,0 +1,140 @@
# Confirm RobustNets dataset is set up correctly with various checks.
# Illustrate how to access RobustNets models (see `iterate_over_RobustNets`).
# Expected output: "All checks passed!"

from models import (Conv8, ResNet18, VGG16, cConv8, cResNet18, cVGG16,
                    instantiate_model)
from utilities import get_model_string, get_args
import torch
from torchvision import datasets, transforms
import json
from functools import partial
import os
from pathlib import Path
from tqdm import tqdm

def iterate_over_RobustNets(function_applied_to_each_model):
"""
Iterates over all models in the RobustNets dataset, applying the
`function_applied_to_each_model` to the unique identifier of
each RobustNets model (`model_string`).
As illustrated in `check_RobustNets_c10_accuracy`, `model_string`
and the path to the RobustNets folder are required to instantiate
a model (via `instantiate_model`).
The following for loops and if conditions show the span of RobustNets.
"""
model_names = ['Conv8', 'ResNet18', 'VGG16']
pruning_approaches = ['biprop', 'edgepopup', 'GMP', 'FT', 'lrr', 'lth']
sparsity_levels = [0.0, 0.5, 0.6, 0.8, 0.9, 0.95]
sparsity_types = ['globally', 'layerwise']
data_augmentations = ['augmix', 'gaussian', 'clean']

for pruning_approach in tqdm(pruning_approaches):
for model_name in model_names:
for sparsity in sparsity_levels:
for sparsity_type in sparsity_types:
for data_augmentation in data_augmentations:
if sparsity == 0:
if pruning_approach not in ['lrr']:
continue # we only have 1 model with 0 sparsity (i.e., 1 unpruned model)
if sparsity_type == 'layerwise':
if (pruning_approach in ['lrr', 'lth']) or (sparsity==0.95 and model_name=='Conv8'):
continue # 'lth' and 'lrr' pruning was always done globally; Conv8 layerwise 0.95 sparsity excluded

# define unique model string in terms of variable values
model_string = get_model_string(model_name, data_augmentation, pruning_approach, sparsity_type, sparsity)
function_applied_to_each_model(model_string)

def check_RobustNets_c10_accuracy(test_loader, PATH_TO_RobustNets, metric_and_OOD_var_dict, model_string):
"""
For each CIFAR-10 model analyzed in "Models Out of Line...", load the model,
then compute its test accuracy. The model loaded correctly if this test accuracy
matches the accuracy we used in the paper, which was computed after training.
"""
state_dict_name = model_string + '_state_dict.pt'
# build model and load its state dict from the RobustNets location
model = instantiate_model(model_string, PATH_TO_RobustNets)
# confirm loaded model's accuracy matches accuracy found during training
test_acc = compute_test_accuracy(test_loader, model)
c10_acc = metric_and_OOD_var_dict[model_string]['cifar10_acc']
acc_string = f'c10_acc was {c10_acc}, computed acc was {test_acc}'
assert(test_acc == c10_acc), acc_string
print(model_string + f' c10 acc matches precomputed acc ({c10_acc}%)')

def check_RobustNets_existence(PATH_TO_RobustNets, metric_and_OOD_var_dict):
"""
Confirm 1) there's a model for each `metric_and_OOD_var_dict` key,
2) there's a key for each model, and 3) the iterator is comprehensive.
"""
# checks 1 and 2
files = os.listdir(PATH_TO_RobustNets)
count = 0
expected_count = len(metric_and_OOD_var_dict)
for f in files:
if f[-len('_state_dict.pt'):] == '_state_dict.pt':
assert f.replace('_state_dict.pt',
'') in metric_and_OOD_var_dict, f'RobustNets model {f} not in dictionary keys.'
count += 1
if count != expected_count:
print(f'Expected {expected_count} RobustNets models but found {count}. Your download may be incomplete.')
# figure out which model is missing
for key in metric_and_OOD_var_dict:
assert os.path.exists(
PATH_TO_RobustNets/key+'_state_dict.pt'), f'Model {key} not in RobustNets directory.'

# check 3, is the iterator comprehensive?
global iterator_count
iterator_count = 0
def check_vals_in_iterator(model_string):
global iterator_count
if model_string in metric_and_OOD_var_dict:
iterator_count+=1
else:
assert False, f'iterator created unexpected value {model_string}'
iterate_over_RobustNets(check_vals_in_iterator)
assert iterator_count == expected_count, f'Expected {expected_count} RobustNets models but iterated over {iterator_count}. The iterator may have been modified.'

def get_c10_test_loader(data_dir):
    normalize = transforms.Normalize(
        mean=[0.491, 0.482, 0.447], std=[0.247, 0.243, 0.262])
    c10_transforms = transforms.Compose([transforms.ToTensor(), normalize])
    test_set = datasets.CIFAR10(root=data_dir,
                                train=False,
                                download=True,
                                transform=c10_transforms)
    return torch.utils.data.DataLoader(test_set, batch_size=400, num_workers=4, pin_memory=True)

def compute_test_accuracy(test_loader, model):
"""
Compute CIFAR-10 test accuracy on GPU
"""
model.cuda()
model.eval()
y_hats = torch.tensor([], dtype=torch.int64).cuda()
y_s = torch.tensor([], dtype=torch.int64)
with torch.no_grad():
for x, y in test_loader:
y_hat = model(x.cuda())
y_hats = torch.cat((y_hats, y_hat.argmax(1)))
y_s = torch.cat((y_s, y))
return round( (y_hats.cpu() == y_s).sum().item() / len(y_s) * 100, 2)

if __name__ == '__main__':
    args = get_args()
    PATH_TO_RobustNets = Path(args.PATH_TO_RobustNets)
    assert args.PATH_TO_c10, 'you must specify a location for the CIFAR-10 data we will create using the arg --PATH_TO_c10'
    PATH_TO_c10_data = Path(args.PATH_TO_c10)
    PATH_TO_metric_and_OOD_var_dict = 'RobustNets/metric_and_OOD_var_dict.json'
    with open(PATH_TO_metric_and_OOD_var_dict, 'r') as f:
        metric_and_OOD_var_dict = json.load(f)
    print('**********************************\nRunning RobustNets existence checks.')
    check_RobustNets_existence(PATH_TO_RobustNets, metric_and_OOD_var_dict)
    print('RobustNets existence checks passed.')
    print('**********************************\nRunning RobustNets accuracy checks.')
    iterate_over_RobustNets(
        partial(check_RobustNets_c10_accuracy, get_c10_test_loader(PATH_TO_c10_data),
                PATH_TO_RobustNets, metric_and_OOD_var_dict))
    print('RobustNets accuracy checks passed.')
    print('**********************************\nAll checks passed!')
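
As a quick illustration (not part of the committed file), the helpers defined above compose to evaluate a single model. A GPU is required, the paths follow the README, and the specific identifier below is just one valid combination from `iterate_over_RobustNets`; the exact identifier format is defined by `get_model_string` in `utilities.py`.

```
# Sketch: evaluate one RobustNets model directly (requires a GPU; paths follow the README).
from pathlib import Path
from models import instantiate_model
from utilities import get_model_string
from RobustNets import get_c10_test_loader, compute_test_accuracy

PATH_TO_RobustNets = Path('RobustNets')
test_loader = get_c10_test_loader('tempC')  # torchvision downloads CIFAR-10 here if needed

# e.g., clean-trained ResNet18, globally pruned to 90% sparsity with GMP
model_string = get_model_string('ResNet18', 'clean', 'GMP', 'globally', 0.9)
model = instantiate_model(model_string, PATH_TO_RobustNets)
print(model_string, compute_test_accuracy(test_loader, model), '% CIFAR-10 test accuracy')
```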
1 change: 1 addition & 0 deletions RobustNets/metric_and_OOD_var_dict.json

Large diffs are not rendered by default.

