Adversarial Examples in Deep Neural Networks: An Overview

Chapter in Deep Learning: Algorithms and Applications

Part of the book series: Studies in Computational Intelligence (SCI, volume 865)

Abstract

Deep learning architectures are vulnerable to adversarial perturbations: small changes which, when added to the input, drastically alter the output of deep networks. The perturbed inputs are called adversarial examples, and they have been observed in various learning tasks, from supervised learning to unsupervised and reinforcement learning. In this chapter, we review some of the most important highlights in the theory and practice of adversarial examples. The focus is on the design of adversarial attacks, the theoretical investigation of the nature of adversarial examples, and the establishment of defenses against adversarial attacks. A common thread in the design of adversarial attacks is the perturbation analysis of learning algorithms; many existing algorithms rely implicitly on perturbation analysis to generate adversarial examples, and we summarize the most powerful attacks in this light. We then overview various theories on the existence of adversarial examples, as well as theories that relate the generalization error to adversarial robustness. Finally, various defenses against adversarial examples are discussed.
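
As a concrete illustration of the perturbation-analysis viewpoint, the following sketch crafts an adversarial example with a single gradient-sign step in the spirit of the fast gradient sign method (FGSM). It is a minimal sketch rather than the chapter's algorithm: the classifier model, the cross-entropy loss, and the budget eps are illustrative placeholders, and PyTorch is used only for concreteness.

    import torch
    import torch.nn.functional as F

    # One-step ell_inf attack in the spirit of FGSM: move the input in the
    # direction of the sign of the loss gradient, within a budget eps.
    def fgsm_attack(model, x, y, eps=0.03):
        x_adv = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        x_adv = x_adv + eps * x_adv.grad.sign()
        # Keep the perturbed input inside the valid pixel range [0, 1].
        return x_adv.clamp(0.0, 1.0).detach()

Every entry of the added term has magnitude eps, so the perturbation satisfies the \(\ell _\infty \)-budget \(\Vert \varvec{\eta }\Vert _\infty \le \varepsilon \) by construction.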


Notes

  1. Similar to the so-called \(\ell _0\)-norm, this is not a proper norm.

  2. A colorization model predicts the color values for every pixel in a given gray-scale image.

  3. We call the pair \((p, q)\) dual if the corresponding norms are dual; in particular, \(1/p+1/q=1\). The precise relation is recalled after these notes.

  4. In this section, we focus mainly on binary classification examples, assuming that the results can be extended without particular difficulty to multi-class classification problems.

  5. They also consider semi-random noise; however, we restrict ourselves to simple random noise.

  6. In that work, the \(\ell _\infty \)-constraint \(\Vert \varvec{\eta }\Vert _\infty \le \varepsilon = 0.3\) is employed to train models whose input values lie between 0 and 1; a sketch of the corresponding projection step is given after these notes.

  7. Pruning consists of setting to zero the smallest weights (in absolute value) of a given weight matrix, thus enforcing a certain level of sparsity. The fraction of weights set to zero is chosen arbitrarily. Pruning usually requires an extra retraining phase (fine-tuning of the remaining non-zero weights) to compensate for the performance degradation caused by the initial manipulation of the weights; a sketch of magnitude pruning is given after these notes.
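
Footnote 3 uses the standard notion of dual norms: for \(1/p + 1/q = 1\), the \(\ell _q\)-norm is the dual of the \(\ell _p\)-norm,

    \[
      \Vert x \Vert _q \;=\; \sup _{\Vert y \Vert _p \le 1} y^\top x ,
      \qquad \frac{1}{p} + \frac{1}{q} = 1 ,
    \]

and Hölder's inequality \(|y^\top x| \le \Vert y \Vert _p \, \Vert x \Vert _q\) holds. This is why dual norms arise in the perturbation analysis of attacks: maximizing a linearized loss \(\varvec{\eta }^\top \nabla _{\varvec{x}}\ell \) over the budget \(\Vert \varvec{\eta }\Vert _p \le \varepsilon \) yields the value \(\varepsilon \,\Vert \nabla _{\varvec{x}}\ell \Vert _q\).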
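
Footnote 6 refers to training under an \(\ell _\infty \)-constraint on the perturbation. Enforcing such a constraint amounts to elementwise clipping onto the \(\varepsilon \)-ball around the clean input, followed by clipping to the valid input range [0, 1]. The sketch below is an illustration under these assumptions (PyTorch tensors, eps = 0.3 as in the footnote), not the training code of the cited work.

    import torch

    # Project a perturbed input x_adv back onto the ell_inf ball of radius
    # eps around the clean input x, then onto the valid input range [0, 1].
    def project_linf(x_adv, x, eps=0.3):
        x_adv = torch.max(torch.min(x_adv, x + eps), x - eps)
        return x_adv.clamp(0.0, 1.0)

In iterative attacks and in adversarial training, this projection is applied after every gradient step so that \(\Vert \varvec{\eta }\Vert _\infty \le \varepsilon \) holds throughout.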
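
Footnote 7 describes magnitude pruning. The sketch below zeroes out a chosen fraction of the smallest-magnitude entries of a weight tensor; the sparsity level is a placeholder, and the fine-tuning of the surviving weights mentioned in the footnote is left to the surrounding training loop.

    import torch

    # Set the `sparsity` fraction of smallest-magnitude entries of `weight`
    # to zero, returning the pruned tensor.
    def magnitude_prune(weight, sparsity=0.5):
        k = int(sparsity * weight.numel())
        if k == 0:
            return weight
        # The k-th smallest absolute value acts as the pruning threshold.
        threshold = weight.abs().flatten().kthvalue(k).values
        mask = (weight.abs() > threshold).to(weight.dtype)
        return weight * mask

Keeping the resulting zero pattern fixed while retraining the remaining non-zero weights corresponds to the fine-tuning phase described in the footnote.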


Acknowledgements

The authors would like to thank the reviewers for their fruitful feedback.

Author information

Corresponding author

Correspondence to Arash Behboodi.



Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Balda, E.R., Behboodi, A., Mathar, R. (2020). Adversarial Examples in Deep Neural Networks: An Overview. In: Pedrycz, W., Chen, SM. (eds) Deep Learning: Algorithms and Applications. Studies in Computational Intelligence, vol 865. Springer, Cham. https://doi.org/10.1007/978-3-030-31760-7_2
