
Are You Tampering with My Data?

  • Michele Alberti
  • Vinaychandran Pondenkandath
  • Marcel Würsch
  • Manuel Bouillon
  • Mathias Seuret
  • Rolf Ingold
  • Marcus Liwicki
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11130)

Abstract

We propose a novel approach towards adversarial attacks on neural networks (NN), focusing on tampering with the data used for training rather than on generating attacks against trained models. Our network-agnostic method creates a backdoor during training that can be exploited at test time to force a neural network to exhibit abnormal behaviour. We demonstrate on two widely used datasets (CIFAR-10 and SVHN) that a universal modification of just one pixel per image, applied to all images of a single class in the training set, is enough to corrupt the training procedure of several state-of-the-art deep neural networks, causing them to misclassify any image to which the modification is applied. Our aim is to bring to the attention of the machine learning community the possibility that even learning-based methods trained personally on public datasets can be subject to attacks by a skillful adversary.
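
The one-pixel tampering described above is simple enough to sketch in code. The following is a minimal, hypothetical illustration in plain NumPy of modifying one fixed pixel in every training image of a single class, and of applying the same modification as a test-time trigger. The pixel location, colour value, target class, and function names (apply_trigger, poison_training_set) are assumptions made for this sketch, not the settings or code used in the paper.

    import numpy as np

    # Illustrative settings for the sketch (not the paper's actual choices).
    PIXEL_ROW, PIXEL_COL = 0, 0                           # coordinates of the pixel to overwrite
    PIXEL_VALUE = np.array([255, 0, 0], dtype=np.uint8)   # fixed RGB value written into that pixel
    POISONED_CLASS = 3                                    # class whose training images are tampered

    def apply_trigger(image):
        """Return a copy of an H x W x 3 uint8 image with one pixel overwritten."""
        tampered = image.copy()
        tampered[PIXEL_ROW, PIXEL_COL] = PIXEL_VALUE
        return tampered

    def poison_training_set(images, labels):
        """Tamper every training image of POISONED_CLASS; leave the rest untouched."""
        poisoned = images.copy()
        for i in np.where(labels == POISONED_CLASS)[0]:
            poisoned[i] = apply_trigger(images[i])
        return poisoned

    # Training: the network is fit on (poison_training_set(x_train, y_train), y_train),
    # so the trigger pixel becomes associated with POISONED_CLASS.
    # Test time: tampered = apply_trigger(clean_test_image) is expected to bias the
    # prediction of a network trained on the poisoned data.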

Keywords

Adversarial attack · Machine learning · Deep neural networks · Data poisoning

Acknowledgment

The work presented in this paper has been partially supported by the HisDoc III project funded by the Swiss National Science Foundation with the grant number 205120_169618.


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Michele Alberti¹
  • Vinaychandran Pondenkandath¹ (corresponding author)
  • Marcel Würsch¹
  • Manuel Bouillon¹
  • Mathias Seuret¹
  • Rolf Ingold¹
  • Marcus Liwicki¹ ²

  1. Document Image and Voice Analysis Group (DIVA), University of Fribourg, Fribourg, Switzerland
  2. Machine Learning Group, Luleå University of Technology, Luleå, Sweden
