Abstract
Deep learning architectures are vulnerable to adversarial perturbations: small changes that, when added to the input, drastically alter the output of deep networks. The resulting inputs are called adversarial examples, and they have been observed in a variety of learning tasks, from supervised learning to unsupervised and reinforcement learning. In this chapter, we review some of the most important highlights in the theory and practice of adversarial examples. The focus is on designing adversarial attacks, theoretical investigations into the nature of adversarial examples, and defenses against adversarial attacks. A common thread in the design of adversarial attacks is the perturbation analysis of learning algorithms; many existing algorithms rely implicitly on perturbation analysis to generate adversarial examples, and we summarize the most powerful attacks in this light. We also survey various theories on the existence of adversarial examples, as well as theories relating the generalization error to adversarial robustness. Finally, we discuss various defenses against adversarial examples.
Notes
- 1.
Similar to the so-called \(\ell _0\)-norm, this is not a proper norm (see the example after this list).
- 2.
A colorization model predicts the color values for every pixel in a given gray-scale image.
- 3.
We call the pair (p, q) dual if the corresponding norms are dual; in particular, \(1/p+1/q=1\) (see the display after this list).
- 4.
In this section, we focus mainly on binary classification examples, assuming that the results can be extended without particular difficulty to multi-class classification problems.
- 5.
They consider semi-random noise as well; however, we restrict ourselves to simple random noise.
- 6.
In that work, the \(\ell _\infty \)-constraint \(\Vert \varvec{\eta }\Vert _\infty \le \varepsilon = 0.3\) is employed to train models whose input values lie between 0 and 1 (see the sketch after this list).
- 7.
Pruning consists of setting the smallest weights (in absolute value) of a given weight matrix to zero, thus enforcing a certain level of sparsity; the fraction of weights to be zeroed is chosen arbitrarily (see the sketch after this list). Pruning usually requires an extra retraining phase (fine-tuning of the remaining non-zero weights) to compensate for the performance degradation caused by the initial manipulation of the weights.
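Example for note 1: the \(\ell _0\)-"norm" counts the nonzero entries of a vector, \(\Vert \varvec{x}\Vert _0 = |\{ i : x_i \ne 0 \}|\). It violates the absolute homogeneity required of a proper norm, since \(\Vert \alpha \varvec{x}\Vert _0 = \Vert \varvec{x}\Vert _0 \ne |\alpha |\,\Vert \varvec{x}\Vert _0\) for \(\varvec{x} \ne \varvec{0}\) and \(|\alpha | \notin \{0, 1\}\).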
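Example for note 3: the duality is the usual Hölder duality of \(\ell _p\)-norms. For a dual pair with \(1/p + 1/q = 1\), e.g., \((p, q) = (2, 2)\) or \((p, q) = (\infty , 1)\), one has \(\Vert \varvec{x}\Vert _q = \sup _{\Vert \varvec{z}\Vert _p \le 1} \langle \varvec{z}, \varvec{x}\rangle \) and \(|\langle \varvec{z}, \varvec{x}\rangle | \le \Vert \varvec{z}\Vert _p \, \Vert \varvec{x}\Vert _q\).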
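Sketch for note 6: a minimal projected-gradient attack under the constraint \(\Vert \varvec{\eta }\Vert _\infty \le \varepsilon = 0.3\) with inputs in \([0, 1]\), written against a PyTorch-style interface. The model, loss function, step size, and iteration count below are illustrative assumptions, not the exact settings of the cited work.

    import torch

    def linf_pgd(model, loss_fn, x, y, eps=0.3, alpha=0.01, steps=40):
        """Maximize loss_fn over the ell_inf ball of radius eps around x,
        keeping every iterate inside the valid input range [0, 1]."""
        x_adv = x.clone().detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = loss_fn(model(x_adv), y)
            grad, = torch.autograd.grad(loss, x_adv)
            with torch.no_grad():
                x_adv = x_adv + alpha * grad.sign()                    # ascend the loss
                x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)  # enforce ||eta||_inf <= eps
                x_adv = torch.clamp(x_adv, 0.0, 1.0)                   # keep inputs in [0, 1]
        return x_adv.detach()

For adversarial training, the returned x_adv would replace (or augment) the clean batch at each training step.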
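Sketch for note 7: a minimal magnitude-pruning routine for a single weight matrix; the sparsity level below is an arbitrary illustrative choice, and the retraining phase is only indicated in a comment.

    import numpy as np

    def prune_by_magnitude(w: np.ndarray, sparsity: float) -> np.ndarray:
        """Set the fraction `sparsity` of the entries of w with the smallest
        absolute value to zero; the remaining weights are left unchanged."""
        k = int(sparsity * w.size)                 # number of weights to zero out
        if k == 0:
            return w.copy()
        threshold = np.sort(np.abs(w), axis=None)[k - 1]
        mask = np.abs(w) > threshold               # keep only the large-magnitude weights
        return w * mask

    w_pruned = prune_by_magnitude(np.random.randn(256, 128), sparsity=0.9)
    # A fine-tuning pass over the surviving non-zero weights would follow.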
References
Akhtar, N., Mian, A.: Threat of adversarial attacks on deep learning in computer vision: a survey. IEEE Access 6, 14410–14430 (2018)
Anthony, M., Bartlett, P.L.: Neural Network Learning: Theoretical Foundations. Cambridge University Press, Cambridge (2009)
Arora, S., Ge, R., Neyshabur, B., Zhang, Y.: Stronger generalization bounds for deep nets via a compression approach. In: International Conference on Machine Learning, pp. 254–263 (2018)
Athalye, A., Carlini, N.: On the robustness of the CVPR 2018 white-box adversarial example defenses (Apr 2018). arXiv:1804.03286
Athalye, A., Carlini, N., Wagner, D.: Obfuscated gradients give a false sense of security: circumventing defenses to adversarial examples. In: International Conference on Machine Learning (2018)
Attias, I., Kontorovich, A., Mansour, Y.: Improved generalization bounds for robust learning. In: Algorithmic Learning Theory, pp. 162–183 (2019)
Balda, E.R., Behboodi, A., Mathar, R.: On generation of adversarial examples using convex programming. In: 52nd Asilomar Conference on Signals, Systems, and Computers, pp. 1–6. Pacific Grove, California, USA (Oct 2018)
Balda, E.R., Behboodi, A., Mathar, R.: Perturbation analysis of learning algorithms: generation of adversarial examples from classification to regression. IEEE Trans. Signal Process. (2018)
Barreno, M., Nelson, B., Joseph, A.D., Tygar, J.D.: The security of machine learning. Mach. Learn. 81(2), 121–148 (2010)
Bartlett, P.L., Foster, D.J., Telgarsky, M.J.: Spectrally-normalized margin bounds for neural networks. In: Advances in Neural Information Processing Systems, vol. 30, pp. 6240–6249. Curran Associates, Inc. (2017)
Carlini, N., Wagner, D.: Towards evaluating the robustness of neural networks. In: 2017 IEEE Symposium on Security and Privacy (SP), pp. 39–57. IEEE (2017)
Cisse, M., Adi, Y., Neverova, N., Keshet, J.: Houdini: fooling deep structured prediction models (2017). arXiv:1707.05373
Cullina, D., Bhagoji, A.N., Mittal, P.: PAC-learning in the presence of adversaries. In: Advances in Neural Information Processing Systems, vol. 31, pp. 228–239. Curran Associates, Inc. (2018)
Diochnos, D., Mahloujifar, S., Mahmoody, M.: Adversarial risk and robustness: general definitions and implications for the uniform distribution. In: Advances in Neural Information Processing Systems, pp. 10359–10368 (2018)
Dohmatob, E.: Limitations of adversarial robustness: strong No Free Lunch Theorem (Oct 2018). arXiv:1810.04065 [cs, stat]
Fawzi, A., Moosavi-Dezfooli, S.M., Frossard, P.: The robustness of deep networks: a geometrical perspective. IEEE Signal Process. Mag. 34(6), 50–62 (2017)
Fawzi, A., Fawzi, O., Frossard, P.: Fundamental limits on adversarial robustness. In: Proceedings of ICML, Workshop on Deep Learning (2015)
Fawzi, A., Fawzi, O., Frossard, P.: Analysis of classifiers’ robustness to adversarial perturbations. Mach. Learn. 107(3), 481–508 (2018)
Fawzi, A., Moosavi-Dezfooli, S.M., Frossard, P.: Robustness of classifiers: from adversarial to random noise. In: Advances in Neural Information Processing Systems, vol. 29, pp. 1632–1640. Curran Associates, Inc. (2016)
Fawzi, A., Moosavi-Dezfooli, S.M., Frossard, P., Soatto, S.: Classification regions of deep neural networks (May 2017). arXiv:1705.09552
Franceschi, J.Y., Fawzi, A., Fawzi, O.: Robustness of classifiers to uniform \(\ell _p\) and Gaussian noise. In: 21st International Conference on Artificial Intelligence and Statistics (AISTATS), vol. 84, p. 9. Lanzarote, Spain (2018)
Gilmer, J., Metz, L., Faghri, F., Schoenholz, S., Raghu, M., Wattenberg, M., Goodfellow, I.: Adversarial spheres. In: ICLR 2018-Workshop Track (2018)
Golowich, N., Rakhlin, A., Shamir, O.: Size-independent sample complexity of neural networks. In: Bubeck, S., Perchet, V., Rigollet, P. (eds.) Proceedings of the 31st Conference On Learning Theory. Proceedings of Machine Learning Research, vol. 75, pp. 297–299. PMLR (Jul 2018)
Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. In: International Conference on Learning Representations (Dec 2014)
Guo, Y., Zhang, C., Zhang, C., Chen, Y.: Sparse DNNs with improved adversarial robustness. In: Advances in Neural Information Processing Systems, vol. 31, pp. 240–249. Curran Associates, Inc. (2018)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (Jun 2016)
Hein, M., Andriushchenko, M.: Formal guarantees on the robustness of a classifier against adversarial manipulation. In: NIPS (2017)
Hendrik Metzen, J., Chaithanya Kumar, M., Brox, T., Fischer, V.: Universal adversarial perturbations against semantic image segmentation. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2755–2764 (2017)
Hinton, G., Deng, L., Yu, D., Dahl, G.E., Mohamed, A.R., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T.N., Kingsbury, B.: Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Process. Mag. 29(6), 82–97 (2012)
Huang, G., Liu, Z., Weinberger, K.Q., van der Maaten, L.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017)
Khim, J., Loh, P.L.: Adversarial risk bounds via function transformation (Oct 2018). arXiv:1810.09519
Khrulkov, V., Oseledets, I.V.: Art of singular vectors and universal adversarial perturbations. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8562–8570 (2018)
Kos, J., Fischer, I., Song, D.: Adversarial examples for generative models. In: 2018 IEEE Security and Privacy Workshops (SPW), pp. 36–42 (May 2018)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
Kurakin, A., Goodfellow, I., Bengio, S.: Adversarial examples in the physical world (2016). arXiv:1607.02533
Langenberg, P., Balda, E.R., Behboodi, A., Mathar, R.: On the effect of low-rank weights on adversarial robustness of neural networks (2019). arXiv:1901.10371
LeCun, Y., Haffner, P., Bottou, L., Bengio, Y.: Object recognition with gradient-based learning. In: Shape, Contour and Grouping in Computer Vision, pp. 319–345. Springer (1999)
Liao, F., Liang, M., Dong, Y., Pang, T., Hu, X., Zhu, J.: Defense against adversarial attacks using high-level representation guided denoiser. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (June 2018)
Lin, M., Chen, Q., Yan, S.: Network in network (2013). arXiv:1312.4400
Lin, Y.C., Hong, Z.W., Liao, Y.H., Shih, M.L., Liu, M.Y., Sun, M.: Tactics of adversarial attack on deep reinforcement learning agents. In: Proceedings of the 26th International Joint Conference on Artificial Intelligence, pp. 3756–3762. IJCAI’17, AAAI Press, Melbourne, Australia (2017)
Liu, Y., Chen, X., Liu, C., Song, D.: Delving into transferable adversarial examples and black-box attacks. In: ICLR 2017 (2017)
Luo, Y., Boix, X., Roig, G., Poggio, T., Zhao, Q.: Foveation-based mechanisms alleviate adversarial examples (Nov 2015). arXiv:1511.06292
Madry, A., Makelov, A., Schmidt, L., Tsipras, D., Vladu, A.: Towards deep learning models resistant to adversarial attacks. In: International Conference on Learning Representations (2018)
Mahloujifar, S., Diochnos, D.I., Mahmoody, M.: The curse of concentration in robust learning: evasion and poisoning attacks from concentration of measure. In: Thirty-Third AAAI Conference on Artificial Intelligence (AAAI) (2019)
Moosavi-Dezfooli, S.M., Fawzi, A., Fawzi, O., Frossard, P.: Universal adversarial perturbations. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1765–1773 (2017)
Moosavi-Dezfooli, S.M., Fawzi, A., Fawzi, O., Frossard, P., Soatto, S.: Robustness of classifiers to universal perturbations: a geometric perspective. In: International Conference on Learning Representations (2018)
Moosavi-Dezfooli, S.M., Fawzi, A., Frossard, P.: DeepFool: a simple and accurate method to fool deep neural networks. In: Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
Moosavi-Dezfooli, S.M., Fawzi, A., Uesato, J., Frossard, P.: Robustness via curvature regularization, and vice versa (Nov 2018). arXiv:1811.09716 [cs, stat]
Neyshabur, B., Bhojanapalli, S., Srebro, N.: A PAC-Bayesian approach to spectrally-normalized margin bounds for neural networks. In: International Conference on Learning Representations (2018)
Novak, R., Bahri, Y., Abolafia, D.A., Pennington, J., Sohl-Dickstein, J.: Sensitivity and generalization in neural networks: an empirical study. In: International Conference on Learning Representations (2018)
Papernot, N., McDaniel, P., Goodfellow, I., Jha, S., Celik, Z.B., Swami, A.: Practical black-box attacks against machine learning. In: Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security, pp. 506–519. ACM (2017)
Papernot, N., McDaniel, P., Jha, S., Fredrikson, M., Celik, Z.B., Swami, A.: The limitations of deep learning in adversarial settings. In: 2016 IEEE European Symposium on Security and Privacy (EuroS&P), pp. 372–387. IEEE (2016)
Papernot, N., McDaniel, P., Swami, A., Harang, R.: Crafting adversarial input sequences for recurrent neural networks. In: Military Communications Conference, MILCOM 2016-2016 IEEE, pp. 49–54. IEEE (2016)
Papernot, N., McDaniel, P., Wu, X., Jha, S., Swami, A.: Distillation as a defense to adversarial perturbations against deep neural networks. In: 2016 IEEE Symposium on Security and Privacy (SP), pp. 582–597. IEEE (2016)
Poole, B., Lahiri, S., Raghu, M., Sohl-Dickstein, J., Ganguli, S.: Exponential expressivity in deep neural networks through transient chaos. In: Advances in Neural Information Processing Systems, vol. 29, pp. 3360–3368. Curran Associates, Inc. (2016)
Raghunathan, A., Steinhardt, J., Liang, P.: Certified defenses against adversarial examples. In: International Conference on Learning Representations (2018)
Rao, N., Recht, B., Nowak, R.: Universal measurement bounds for structured sparse signal recovery. In: Artificial Intelligence and Statistics, pp. 942–950 (Mar 2012)
Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39(6), 1137–1149 (2017)
Sabour, S., Cao, Y., Faghri, F., Fleet, D.J.: Adversarial manipulation of deep representations. In: ICLR 2016 (2016)
Samangouei, P., Kabkab, M., Chellappa, R.: Defense-GAN: protecting classifiers against adversarial attacks using generative models. In: International Conference on Learning Representations (2018)
Sanyal, A., Kanade, V., Torr, P.H.S.: Intriguing properties of learned representations (2018)
Sarkar, S., Bansal, A., Mahbub, U., Chellappa, R.: Upset and angri: breaking high performance image classifiers (2017). arXiv:1707.01159
Schmidt, L., Santurkar, S., Tsipras, D., Talwar, K., Madry, A.: Adversarially robust generalization requires more data. In: Advances in Neural Information Processing Systems, pp. 5014–5026 (2018)
Shalev-Shwartz, S., Ben-David, S.: Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, New York, NY, USA (2014)
Song, Y., Kim, T., Nowozin, S., Ermon, S., Kushman, N.: PixelDefend: leveraging generative models to understand and defend against adversarial examples. In: International Conference on Learning Representations (2018)
Su, J., Vargas, D.V., Sakurai, K.: One pixel attack for fooling deep neural networks. IEEE Trans. Evol. Comput. (2019)
Suggala, A.S., Prasad, A., Nagarajan, V., Ravikumar, P.: Revisiting adversarial risk (Jun 2018). arXiv:1806.02924 [cs, stat]
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9 (2015)
Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., Fergus, R.: Intriguing properties of neural networks. In: International Conference on Learning Representations (2014)
Tabacof, P., Tavares, J., Valle, E.: Adversarial images for variational autoencoders (2016). arXiv:1612.00155
Tanay, T., Griffin, L.: A boundary tilting persepective on the phenomenon of adversarial examples (Aug 2016). arXiv:1608.07690 [cs, stat]
Tramèr, F., Kurakin, A., Papernot, N., Goodfellow, I., Boneh, D., McDaniel, P.: Ensemble adversarial training: attacks and defenses. In: International Conference on Learning Representations (2018)
Tsipras, D., Santurkar, S., Engstrom, L., Turner, A., Madry, A.: Robustness may be at odds with accuracy. In: International Conference on Learning Representations (2019)
Wang, B., Gao, J., Qi, Y.: A theoretical framework for robustness of (Deep) classifiers against adversarial examples. In: International Conference on Learning Representations (2017)
Wang, L., Ding, G.W., Huang, R., Cao, Y., Lui, Y.C.: Adversarial robustness of pruned neural networks (2018). https://openreview.net/forum?id=SJGrAisIz
Xie, C., Wang, J., Zhang, Z., Zhou, Y., Xie, L., Yuille, A.: Adversarial examples for semantic segmentation and object detection. In: International Conference on Computer Vision. IEEE (2017)
Yin, D., Ramchandran, K., Bartlett, P.: Rademacher complexity for adversarially robust generalization (Oct 2018). arXiv:1810.11914 [cs, stat]
Yuan, X., He, P., Zhu, Q., Li, X.: Adversarial examples: attacks and defenses for deep learning. IEEE Trans. Neural Netw. Learn. Syst. 1–20 (2019)
Zhang, C., Bengio, S., Hardt, M., Recht, B., Vinyals, O.: Understanding deep learning requires rethinking generalization. In: International Conference on Learning Representations (2017)
Acknowledgements
The authors would like to thank the reviewers for their helpful feedback.
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Balda, E.R., Behboodi, A., Mathar, R. (2020). Adversarial Examples in Deep Neural Networks: An Overview. In: Pedrycz, W., Chen, SM. (eds) Deep Learning: Algorithms and Applications. Studies in Computational Intelligence, vol 865. Springer, Cham. https://doi.org/10.1007/978-3-030-31760-7_2
DOI: https://doi.org/10.1007/978-3-030-31760-7_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-31759-1
Online ISBN: 978-3-030-31760-7
eBook Packages: Intelligent Technologies and Robotics (R0)