Python code for generating universal adversarial audio perturbations (UAPs) [1]. The target model is a combination of SincNet [2] and VGG19; the Keras implementation of SincNet (M. Ravanelli and Y. Bengio) [2] is used, and a minimal sketch of the overall architecture is given below. The model is trained on the UrbanSound8k dataset [3]. For details on data normalization, please refer to our paper [1].
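As a rough orientation, the following is a minimal Keras sketch of what a SincNet + VGG-style raw-waveform classifier looks like. It is illustrative only: a plain `Conv1D` stands in for the SincNet band-pass filter layer (the repo uses the actual SincNet Keras implementation [2]), and `clip_len` and the block widths are assumed values, not the paper's exact configuration.

```python
from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, Flatten, Dense

def build_target_model(clip_len=32000, n_classes=10):
    """Illustrative SincNet+VGG-style classifier on raw waveforms."""
    model = Sequential([
        # placeholder for the SincNet band-pass filter layer
        Conv1D(80, 251, activation='relu', input_shape=(clip_len, 1)),
        MaxPooling1D(3),
        # VGG-style stacks of small convolutions followed by pooling
        Conv1D(64, 3, activation='relu', padding='same'),
        Conv1D(64, 3, activation='relu', padding='same'),
        MaxPooling1D(3),
        Conv1D(128, 3, activation='relu', padding='same'),
        Conv1D(128, 3, activation='relu', padding='same'),
        MaxPooling1D(3),
        Flatten(),
        Dense(256, activation='relu'),
        Dense(n_classes, activation='softmax'),  # UrbanSound8k has 10 classes
    ])
    return model
```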
Two methods are used for UAP generation. The first is an iterative, greedy approach well known in computer vision [4]: it aggregates small perturbations to the input so as to push it toward the decision boundary. The second, which is the main contribution of our paper [1], is a novel penalty formulation that finds both targeted and untargeted universal adversarial perturbations; a hedged sketch of this idea follows below.
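To make the penalty idea concrete, here is a minimal sketch under simplifying assumptions: `model` is the trained classifier (e.g. the SincNet+VGG19 network sketched above), `x_train` holds normalized clips with the model's input shape, and an L2 penalty plus an L-infinity clip stand in for the paper's exact formulation. All names and hyperparameter values are illustrative, not the repo's API.

```python
import numpy as np
import keras.backend as K

def penalty_uap(model, x_train, target_class,
                alpha=1e-3, lr=1e-2, epochs=20, batch_size=32, eps=0.05):
    """Optimize one shared perturbation v by mini-batch gradient descent on
    a targeted cross-entropy data term plus an L2 size penalty on v."""
    v = np.zeros((1,) + x_train.shape[1:], dtype=np.float32)
    y_target = K.one_hot(target_class, model.output_shape[-1])
    # targeted data term: mean cross-entropy toward `target_class`
    data_loss = K.mean(K.categorical_crossentropy(y_target, model.output))
    grad_fn = K.function([model.input, K.learning_phase()],
                         K.gradients(data_loss, [model.input]))

    for _ in range(epochs):
        order = np.random.permutation(len(x_train))
        for i in range(0, len(order), batch_size):
            batch = x_train[order[i:i + batch_size]] + v
            # d/dv of the batch loss is the per-sample input gradient summed
            # over the batch (the mean in `data_loss` divides by batch size)
            g = grad_fn([batch, 0])[0].sum(axis=0, keepdims=True)
            v -= lr * (g + 2.0 * alpha * v)  # data term + L2 penalty gradient
            v = np.clip(v, -eps, eps)        # keep the perturbation small
    return v
```

Following the evaluation protocol of [1] and [4], the resulting `v` would be crafted on training clips and then judged by its fooling rate on held-out audio.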
- tensorflow-gpu>=1.12.0
- keras>=2.2.4
- pysoundfile (`pip install pysoundfile`)
- numpy
- pickle
- sklearn
[1] Abdoli, Sajjad, et al. "Universal adversarial audio perturbations." arXiv preprint arXiv:1908.03173 (2019).
[2] Ravanelli, Mirco, and Yoshua Bengio. "Speaker recognition from raw waveform with SincNet." arXiv preprint arXiv:1808.00158 (2018).
[3] Salamon, Justin, Christopher Jacoby, and Juan Pablo Bello. "A dataset and taxonomy for urban sound research." Proceedings of the 22nd ACM International Conference on Multimedia. ACM, 2014.
[4] Moosavi-Dezfooli, Seyed-Mohsen, et al. "Universal adversarial perturbations." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.