Abstract
Deep learning architectures are vulnerable to adversarial perturbations: small changes that, when added to the input, drastically alter the output of deep networks. The resulting inputs are called adversarial examples, and they have been observed in a variety of learning tasks, from supervised learning to unsupervised and reinforcement learning. In this chapter, we review some of the most important highlights in the theory and practice of adversarial examples. The focus is on designing adversarial attacks, theoretical investigations into the nature of adversarial examples, and defenses against adversarial attacks. A common thread in the design of adversarial attacks is the perturbation analysis of learning algorithms; many existing algorithms rely implicitly on perturbation analysis to generate adversarial examples, and we summarize the most powerful attacks in this light. We also survey various theories on the existence of adversarial examples, as well as theories relating the generalization error to adversarial robustness. Finally, we discuss various defenses against adversarial examples.
Notes
- 1.
Similar to the so-called \(\ell _0\)-norm, this is not a proper norm (see the example after this list).
- 2.
A colorization model predicts the color values for every pixel in a given gray-scale image.
- 3.
We call the pair (p, q) dual if the corresponding norms are dual; in particular, \(1/p+1/q=1\) (see the display after this list).
- 4.
In this section, we focus mainly on binary classification examples, assuming that the results can be extended without particular difficulty to multi-class classification problems.
- 5.
They consider semi-random noise as well; however, we restrict ourselves to simple random noise.
- 6.
In that work, the \(\ell _\infty \)-constraint \(\Vert \varvec{\eta }\Vert _\infty \le \varepsilon = 0.3\) is employed to train models whose input values lie between 0 and 1 (see the sketch after this list).
- 7.
Pruning consists of setting the smallest weights (in absolute value) of a given weight matrix to zero, thus enforcing a certain level of sparsity; the fraction of weights to be zeroed is chosen arbitrarily (see the sketch after this list). Pruning usually requires an extra retraining phase (fine-tuning of the remaining non-zero weights) to compensate for the performance degradation caused by the initial manipulation of the weights.
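Example for note 1: the \(\ell _0\)-"norm" counts the nonzero entries of a vector, \(\Vert \varvec{x}\Vert _0 = |\{ i : x_i \ne 0 \}|\). It violates the absolute homogeneity required of a proper norm, since \(\Vert \alpha \varvec{x}\Vert _0 = \Vert \varvec{x}\Vert _0 \ne |\alpha |\,\Vert \varvec{x}\Vert _0\) for \(\varvec{x} \ne \varvec{0}\) and \(|\alpha | \notin \{0, 1\}\).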
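Example for note 3: the duality is the usual Hölder duality of \(\ell _p\)-norms. For a dual pair with \(1/p + 1/q = 1\), e.g., \((p, q) = (2, 2)\) or \((p, q) = (\infty , 1)\), one has \(\Vert \varvec{x}\Vert _q = \sup _{\Vert \varvec{z}\Vert _p \le 1} \langle \varvec{z}, \varvec{x}\rangle \) and \(|\langle \varvec{z}, \varvec{x}\rangle | \le \Vert \varvec{z}\Vert _p \, \Vert \varvec{x}\Vert _q\).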
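Sketch for note 6: a minimal projected-gradient attack under the constraint \(\Vert \varvec{\eta }\Vert _\infty \le \varepsilon = 0.3\) with inputs in \([0, 1]\), written against a PyTorch-style interface. The model, loss function, step size, and iteration count below are illustrative assumptions, not the exact settings of the cited work.

    import torch

    def linf_pgd(model, loss_fn, x, y, eps=0.3, alpha=0.01, steps=40):
        """Maximize loss_fn over the ell_inf ball of radius eps around x,
        keeping every iterate inside the valid input range [0, 1]."""
        x_adv = x.clone().detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = loss_fn(model(x_adv), y)
            grad, = torch.autograd.grad(loss, x_adv)
            with torch.no_grad():
                x_adv = x_adv + alpha * grad.sign()                    # ascend the loss
                x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)  # enforce ||eta||_inf <= eps
                x_adv = torch.clamp(x_adv, 0.0, 1.0)                   # keep inputs in [0, 1]
        return x_adv.detach()

For adversarial training, the returned x_adv would replace (or augment) the clean batch at each training step.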
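Sketch for note 7: a minimal magnitude-pruning routine for a single weight matrix; the sparsity level below is an arbitrary illustrative choice, and the retraining phase is only indicated in a comment.

    import numpy as np

    def prune_by_magnitude(w: np.ndarray, sparsity: float) -> np.ndarray:
        """Set the fraction `sparsity` of the entries of w with the smallest
        absolute value to zero; the remaining weights are left unchanged."""
        k = int(sparsity * w.size)                 # number of weights to zero out
        if k == 0:
            return w.copy()
        threshold = np.sort(np.abs(w), axis=None)[k - 1]
        mask = np.abs(w) > threshold               # keep only the large-magnitude weights
        return w * mask

    w_pruned = prune_by_magnitude(np.random.randn(256, 128), sparsity=0.9)
    # A fine-tuning pass over the surviving non-zero weights would follow.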
References
Akhtar, N., Mian, A.: Threat of adversarial attacks on deep learning in computer vision: a survey. IEEE Access 6, 14410–14430 (2018)
Anthony, M., Bartlett, P.L.: Neural Network Learning: Theoretical Foundations. Cambridge University Press, Cambridge (2009)
Arora, S., Ge, R., Neyshabur, B., Zhang, Y.: Stronger generalization bounds for deep nets via a compression approach. In: International Conference on Machine Learning, pp. 254–263 (2018)
Athalye, A., Carlini, N.: On the robustness of the CVPR 2018 white-box adversarial example defenses (Apr 2018). arXiv:1804.03286
Athalye, A., Carlini, N., Wagner, D.: Obfuscated gradients give a false sense of security: circumventing defenses to adversarial examples. In: International Conference on Machine Learning (2018)
Attias, I., Kontorovich, A., Mansour, Y.: Improved generalization bounds for robust learning. In: Algorithmic Learning Theory, pp. 162–183 (2019)
Balda, E.R., Behboodi, A., Mathar, R.: On generation of adversarial examples using convex programming. In: 52nd Asilomar Conference on Signals, Systems, and Computers, pp. 1–6. Pacific Grove, California, USA (Oct 2018)
Balda, E.R., Behboodi, A., Mathar, R.: Perturbation analysis of learning algorithms: generation of adversarial examples from classification to regression. IEEE Trans. Signal Process. (2018)
Barreno, M., Nelson, B., Joseph, A.D., Tygar, J.D.: The security of machine learning. Mach. Learn. 81(2), 121–148 (2010)
Bartlett, P.L., Foster, D.J., Telgarsky, M.J.: Spectrally-normalized margin bounds for neural networks. In: Advances in Neural Information Processing Systems, vol. 30, pp. 6240–6249. Curran Associates, Inc. (2017)
Carlini, N., Wagner, D.: Towards evaluating the robustness of neural networks. In: 2017 IEEE Symposium on Security and Privacy (SP), pp. 39–57. IEEE (2017)
Cisse, M., Adi, Y., Neverova, N., Keshet, J.: Houdini: fooling deep structured prediction models (2017). arXiv:1707.05373
Cullina, D., Bhagoji, A.N., Mittal, P.: PAC-learning in the presence of adversaries. In: Advances in Neural Information Processing Systems, vol. 31, pp. 228–239. Curran Associates, Inc. (2018)
Diochnos, D., Mahloujifar, S., Mahmoody, M.: Adversarial risk and robustness: general definitions and implications for the uniform distribution. In: Advances in Neural Information Processing Systems, pp. 10359–10368 (2018)
Dohmatob, E.: Limitations of adversarial robustness: strong No Free Lunch Theorem (Oct 2018). arXiv:1810.04065 [cs, stat]
Fawzi, A., Moosavi-Dezfooli, S.M., Frossard, P.: The robustness of deep networks: a geometrical perspective. IEEE Signal Process. Mag. 34(6), 50–62 (2017)
Fawzi, A., Fawzi, O., Frossard, P.: Fundamental limits on adversarial robustness. In: Proceedings of ICML, Workshop on Deep Learning (2015)
Fawzi, A., Fawzi, O., Frossard, P.: Analysis of classifiers’ robustness to adversarial perturbations. Mach. Learn. 107(3), 481–508 (2018)
Fawzi, A., Moosavi-Dezfooli, S.M., Frossard, P.: Robustness of classifiers: from adversarial to random noise. In: Advances in Neural Information Processing Systems, vol. 29, pp. 1632–1640. Curran Associates, Inc. (2016)
Fawzi, A., Moosavi-Dezfooli, S.M., Frossard, P., Soatto, S.: Classification regions of deep neural networks (May 2017). arXiv:1705.09552
Franceschi, J.Y., Fawzi, A., Fawzi, O.: Robustness of classifiers to uniform \(\ell _p\) and Gaussian noise. In: 21st International Conference on Artificial Intelligence and Statistics (AISTATS), vol. 84, p. 9. Lanzarote, Spain (2018)
Gilmer, J., Metz, L., Faghri, F., Schoenholz, S., Raghu, M., Wattenberg, M., Goodfellow, I.: Adversarial spheres. In: ICLR 2018-Workshop Track (2018)
Golowich, N., Rakhlin, A., Shamir, O.: Size-independent sample complexity of neural networks. In: Bubeck, S., Perchet, V., Rigollet, P. (eds.) Proceedings of the 31st Conference On Learning Theory. Proceedings of Machine Learning Research, vol. 75, pp. 297–299. PMLR (Jul 2018)
Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. In: International Conference on Learning Representations (Dec 2014)
Guo, Y., Zhang, C., Zhang, C., Chen, Y.: Sparse DNNs with improved adversarial robustness. In: Advances in Neural Information Processing Systems, vol. 31, pp. 240–249. Curran Associates, Inc. (2018)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (Jun 2016)
Hein, M., Andriushchenko, M.: Formal guarantees on the robustness of a classifier against adversarial manipulation. In: NIPS (2017)
Hendrik Metzen, J., Chaithanya Kumar, M., Brox, T., Fischer, V.: Universal adversarial perturbations against semantic image segmentation. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2755–2764 (2017)
Hinton, G., Deng, L., Yu, D., Dahl, G.E., Mohamed, A.R., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T.N., Kingsbury, B.: Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Process. Mag. 29(6), 82–97 (2012)
Huang, G., Liu, Z., Weinberger, K.Q., van der Maaten, L.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017)
Khim, J., Loh, P.L.: Adversarial risk bounds via function transformation (Oct 2018). arXiv:1810.09519
Khrulkov, V., Oseledets, I.V.: Art of singular vectors and universal adversarial perturbations. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8562–8570 (2018)
Kos, J., Fischer, I., Song, D.: Adversarial examples for generative models. In: 2018 IEEE Security and Privacy Workshops (SPW), pp. 36–42 (May 2018)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
Kurakin, A., Goodfellow, I., Bengio, S.: Adversarial examples in the physical world (2016). arXiv:1607.02533
Langenberg, P., Balda, E.R., Behboodi, A., Mathar, R.: On the effect of low-rank weights on adversarial robustness of neural networks (2019). arXiv:1901.10371
LeCun, Y., Haffner, P., Bottou, L., Bengio, Y.: Object recognition with gradient-based learning. In: Shape, Contour and Grouping in Computer Vision, pp. 319–345. Springer (1999)
Liao, F., Liang, M., Dong, Y., Pang, T., Hu, X., Zhu, J.: Defense against adversarial attacks using high-level representation guided denoiser. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (June 2018)
Lin, M., Chen, Q., Yan, S.: Network in network (2013). arXiv:1312.4400
Lin, Y.C., Hong, Z.W., Liao, Y.H., Shih, M.L., Liu, M.Y., Sun, M.: Tactics of adversarial attack on deep reinforcement learning agents. In: Proceedings of the 26th International Joint Conference on Artificial Intelligence, pp. 3756–3762. IJCAI’17, AAAI Press, Melbourne, Australia (2017)
Liu, Y., Chen, X., Liu, C., Song, D.: Delving into transferable adversarial examples and black-box attacks. In: ICLR 2017 (2017)
Luo, Y., Boix, X., Roig, G., Poggio, T., Zhao, Q.: Foveation-based mechanisms alleviate adversarial examples (Nov 2015). arXiv:1511.06292
Madry, A., Makelov, A., Schmidt, L., Tsipras, D., Vladu, A.: Towards deep learning models resistant to adversarial attacks. In: International Conference on Learning Representations (2018)
Mahloujifar, S., Diochnos, D.I., Mahmoody, M.: The curse of concentration in robust learning: evasion and poisoning attacks from concentration of measure. In: Thirty-Third AAAI Conference on Artificial Intelligence (AAAI) (2019)
Moosavi-Dezfooli, S.M., Fawzi, A., Fawzi, O., Frossard, P.: Universal adversarial perturbations. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1765–1773 (2017)
Moosavi-Dezfooli, S.M., Fawzi, A., Fawzi, O., Frossard, P., Soatto, S.: Robustness of classifiers to universal perturbations: a geometric perspective. In: International Conference on Learning Representations (2018)
Moosavi-Dezfooli, S.M., Fawzi, A., Frossard, P.: DeepFool: a simple and accurate method to fool deep neural networks. In: Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
Moosavi-Dezfooli, S.M., Fawzi, A., Uesato, J., Frossard, P.: Robustness via curvature regularization, and vice versa (Nov 2018). arXiv:1811.09716 [cs, stat]
Neyshabur, B., Bhojanapalli, S., Srebro, N.: A PAC-Bayesian approach to spectrally-normalized margin bounds for neural networks. In: International Conference on Learning Representations (2018)
Novak, R., Bahri, Y., Abolafia, D.A., Pennington, J., Sohl-Dickstein, J.: Sensitivity and generalization in neural networks: an empirical study. In: International Conference on Learning Representations (2018)
Papernot, N., McDaniel, P., Goodfellow, I., Jha, S., Celik, Z.B., Swami, A.: Practical black-box attacks against machine learning. In: Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security, pp. 506–519. ACM (2017)
Papernot, N., McDaniel, P., Jha, S., Fredrikson, M., Celik, Z.B., Swami, A.: The limitations of deep learning in adversarial settings. In: 2016 IEEE European Symposium on Security and Privacy (EuroS&P), pp. 372–387. IEEE (2016)
Papernot, N., McDaniel, P., Swami, A., Harang, R.: Crafting adversarial input sequences for recurrent neural networks. In: Military Communications Conference, MILCOM 2016-2016 IEEE, pp. 49–54. IEEE (2016)
Papernot, N., McDaniel, P., Wu, X., Jha, S., Swami, A.: Distillation as a defense to adversarial perturbations against deep neural networks. In: 2016 IEEE Symposium on Security and Privacy (SP), pp. 582–597. IEEE (2016)
Poole, B., Lahiri, S., Raghu, M., Sohl-Dickstein, J., Ganguli, S.: Exponential expressivity in deep neural networks through transient chaos. In: Advances in Neural Information Processing Systems, vol. 29, pp. 3360–3368. Curran Associates, Inc. (2016)
Raghunathan, A., Steinhardt, J., Liang, P.: Certified defenses against adversarial examples. In: International Conference on Learning Representations (2018)
Rao, N., Recht, B., Nowak, R.: Universal measurement bounds for structured sparse signal recovery. In: Artificial Intelligence and Statistics, pp. 942–950 (Mar 2012)
Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39(6), 1137–1149 (2017)
Sabour, S., Cao, Y., Faghri, F., Fleet, D.J.: Adversarial manipulation of deep representations. In: ICLR 2016 (2016)
Samangouei, P., Kabkab, M., Chellappa, R.: Defense-GAN: protecting classifiers against adversarial attacks using generative models. In: International Conference on Learning Representations (2018)
Sanyal, A., Kanade, V., Torr, P.H.S.: Intriguing properties of learned representations (2018)
Sarkar, S., Bansal, A., Mahbub, U., Chellappa, R.: Upset and angri: breaking high performance image classifiers (2017). arXiv:1707.01159
Schmidt, L., Santurkar, S., Tsipras, D., Talwar, K., Madry, A.: Adversarially robust generalization requires more data. In: Advances in Neural Information Processing Systems, pp. 5014–5026 (2018)
Shalev-Shwartz, S., Ben-David, S.: Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, New York, NY, USA (2014)
Song, Y., Kim, T., Nowozin, S., Ermon, S., Kushman, N.: PixelDefend: leveraging generative models to understand and defend against adversarial examples. In: International Conference on Learning Representations (2018)
Su, J., Vargas, D.V., Sakurai, K.: One pixel attack for fooling deep neural networks. IEEE Trans. Evol. Comput. (2019)
Suggala, A.S., Prasad, A., Nagarajan, V., Ravikumar, P.: Revisiting adversarial risk (Jun 2018). arXiv:1806.02924 [cs, stat]
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9 (2015)
Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., Fergus, R.: Intriguing properties of neural networks. In: International Conference on Learning Representations (2014)
Tabacof, P., Tavares, J., Valle, E.: Adversarial images for variational autoencoders (2016). arXiv:1612.00155
Tanay, T., Griffin, L.: A boundary tilting persepective on the phenomenon of adversarial examples (Aug 2016). arXiv:1608.07690 [cs, stat]
Tramèr, F., Kurakin, A., Papernot, N., Goodfellow, I., Boneh, D., McDaniel, P.: Ensemble adversarial training: attacks and defenses. In: International Conference on Learning Representations (2018)
Tsipras, D., Santurkar, S., Engstrom, L., Turner, A., Madry, A.: Robustness may be at odds with accuracy. In: International Conference on Learning Representations (2019)
Wang, B., Gao, J., Qi, Y.: A theoretical framework for robustness of (Deep) classifiers against adversarial examples. In: International Conference on Learning Representations (2017)
Wang, L., Ding, G.W., Huang, R., Cao, Y., Lui, Y.C.: Adversarial robustness of pruned neural networks (2018). https://openreview.net/forum?id=SJGrAisIz
Xie, C., Wang, J., Zhang, Z., Zhou, Y., Xie, L., Yuille, A.: Adversarial examples for semantic segmentation and object detection. In: International Conference on Computer Vision. IEEE (2017)
Yin, D., Ramchandran, K., Bartlett, P.: Rademacher complexity for adversarially robust generalization (Oct 2018). arXiv:1810.11914 [cs, stat]
Yuan, X., He, P., Zhu, Q., Li, X.: Adversarial examples: attacks and defenses for deep learning. IEEE Trans. Neural Netw. Learn. Syst. 1–20 (2019)
Zhang, C., Bengio, S., Hardt, M., Recht, B., Vinyals, O.: Understanding deep learning requires rethinking generalization. In: International Conference on Learning Representations (2017)
Acknowledgements
The authors would like to thank the reviewers for their helpful feedback.
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Balda, E.R., Behboodi, A., Mathar, R. (2020). Adversarial Examples in Deep Neural Networks: An Overview. In: Pedrycz, W., Chen, SM. (eds) Deep Learning: Algorithms and Applications. Studies in Computational Intelligence, vol 865. Springer, Cham. https://doi.org/10.1007/978-3-030-31760-7_2
DOI: https://doi.org/10.1007/978-3-030-31760-7_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-31759-1
Online ISBN: 978-3-030-31760-7
eBook Packages: Intelligent Technologies and Robotics (R0)