Skip to content

Towards Efficient and Effective Adversarial Training, NeurIPS 2021

License

Notifications You must be signed in to change notification settings

GaurangSriramanan/NuAT

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Towards Efficient and Effective Adversarial Training

This repository contains code for the implementation of our NeurIPS 2021 paper Towards Efficient and Effective Adversarial Training. Accompanying resources can be found here: [video] [poster]

  • We propose single-step Nuclear Norm Adversarial Training (NuAT), and a two step variant of the same (NuAT2), coupled with a novel cyclic-step learning rate schedule, to achieve state-of-the-art results amongst efficient training methods, and results comparable to the effective multi-step training methods.
  • We demonstrate improved performance by using exponential averaging of the network weights coupled with cyclic learning rate schedule (NuAT-WA).
  • We propose a two-step variant NuAT2-WA, that utilizes stable gradients from the weight-averaged model for stable initialization in the first attack step, yielding improved results.
  • We improve the stability and performance of the single-step defense (NuAT) further at a marginal increase in computational cost, by introducing a hybrid approach (NuAT-H) that switches adaptively between single-step and two-step optimization for attack generation.

Trained model checkpoints can be found here.

Nuclear Norm Adversarial Training (NuAT)

In this work, we propose a novel Nuclear Norm regularizer to improve the adversarial robustness of Deep Networks through the use of single-step adversarial training. Training with the proposed Nuclear Norm regularizer enforces function smoothing in the vicinity of clean samples by incorporating joint batch-statistics of adversarial samples, thereby resulting in enhanced robustness.

Nuclear-Norm based Attack: In a given minibatch, we consider to be the matrix composed of vectorized pixel values (arranged row-wise) of each image, to be a matrix of the same dimension as consisting of independently sampled Bernoulli noise, and to be the matrix containing the corresponding ground truth one-hot vectors. The following loss function which utilizes the pre-softmax values is maximized for the generation of single-step adversaries:

In single-step Nuclear Norm Adversarial Training (NuAT), the following loss function is minimized during training:

The first term in the above equation corresponds to the cross-entropy loss on clean samples, and the second term corresponds to the Nuclear-Norm of the difference in pre-softmax values of clean images and their corresponding single-step adversaries .

Results on CIFAR-10

We summarise the robust white-box evaluations on the CIFAR-10 dataset below. We kindly refer the reader to our paper for further details on the two-step adversarial training method (NuAT2), exponential weight averaging (NuAT-WA) and Hybrid Nuclear Norm Adversarial Training (NuAT-H).

Environment Settings

  • Python: 3.6.8
  • PyTorch: 1.6.0+cu101
  • TorchVision: 0.7.0+cu101

Citing this work

@inproceedings{
sriramanan2021towards,
title={Towards Efficient and Effective Adversarial Training},
author={Gaurang Sriramanan and Sravanti Addepalli and Arya Baburaj and Venkatesh Babu Radhakrishnan},
booktitle={Advances in Neural Information Processing Systems},
editor={A. Beygelzimer and Y. Dauphin and P. Liang and J. Wortman Vaughan},
year={2021},
url={https://openreview.net/forum?id=kuK2VARZGnI}
}

About

Towards Efficient and Effective Adversarial Training, NeurIPS 2021

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 100.0%