Skip to main content

VPN: Verification of Poisoning in Neural Networks

  • Conference paper
  • First Online:
Software Verification and Formal Methods for ML-Enabled Autonomous Systems (NSV 2022, FoMLAS 2022)

Abstract

Neural networks are successfully used in a variety of applications, many of them having safety and security concerns. As a result researchers have proposed formal verification techniques for verifying neural network properties. While previous efforts have mainly focused on checking local robustness in neural networks, we instead study another neural network security issue, namely model poisoning. In this case an attacker inserts a trigger into a subset of the training data, in such a way that at test time, this trigger in an input causes the trained model to misclassify to some target class. We show how to formulate the check for model poisoning as a property that can be checked with off-the-shelf verification tools, such as Marabou and nneum, where counterexamples of failed checks constitute the triggers. We further show that the discovered triggers are ‘transferable’ from a small model to a larger, better-trained model, allowing us to analyze state-of-the art performant models trained for image classification tasks.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 54.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 69.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

Notes

  1. 1.

    Examples in this paper are made available open-source https://github.com/theyoucheng/vpn.

  2. 2.

    Github link: https://github.com/NeuralNetworkVerification/Marabou (commit number 54e76b2c027c79d56f14751013fd649c8673dc1b).

  3. 3.

    Github link: https://github.com/stanleybak/nnenum (commit number fd07f2b6c55ca46387954559f40992ae0c9b06b7).

References

  1. Bak, S.: nnenum: verification of relu neural networks with optimized abstraction refinement. In: Dutle, A., Moscato, M.M., Titolo, L., Muñoz, C.A., Perez, I. (eds.) NFM 2021. LNCS, vol. 12673, pp. 19–36. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-76384-8_2

    Chapter  Google Scholar 

  2. Bak, S., Liu, C., Johnson, T.: The second international verification of neural networks competition (vnn-comp 2021): summary and results. arXiv preprint arXiv:2109.00498 (2021)

  3. Biggio, B., et al.: Evasion attacks against machine learning at test time. In: Blockeel, H., Kersting, K., Nijssen, S., Železný, F. (eds.) ECML PKDD 2013. LNCS (LNAI), vol. 8190, pp. 387–402. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40994-3_25

  4. Brown, T.B., Mané, D., Roy, A., Abadi, M., Gilmer, J.: Adversarial patch (2017). 10.48550/ARXIV.1712.09665. https://arxiv.org/abs/1712.09665

  5. Cheng, S., Liu, Y., Ma, S., Zhang, X.: Deep feature space trojan attack of neural networks by controlled detoxification. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 1148–1156 (2021)

    Google Scholar 

  6. Chiang, P.Y., Ni, R., Abdelkader, A., Zhu, C., Studor, C., Goldstein, T.: Certified defenses for adversarial patches. In: International Conference on Learning Representations (2019)

    Google Scholar 

  7. Demontis, A., et al.: Why do adversarial attacks transfer? explaining transferability of evasion and poisoning attacks. In: USENIX Security, pp. 321–338 (2019)

    Google Scholar 

  8. Elboher, Y.Y., Gottschlich, J., Katz, G.: An abstraction-based framework for neural network verification. In: Lahiri, S.K., Wang, C. (eds.) CAV 2020. LNCS, vol. 12224, pp. 43–65. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-53288-8_3

    Chapter  MATH  Google Scholar 

  9. Gao, Y., Xu, C., Wang, D., Chen, S., Ranasinghe, D.C., Nepal, S.: Strip: A defence against trojan attacks on deep neural networks. In: Proceedings of the 35th Annual Computer Security Applications Conference, pp. 113–125 (2019)

    Google Scholar 

  10. Gu, T., Liu, K., Dolan-Gavitt, B., Garg, S.: Badnets: evaluating backdooring attacks on deep neural networks. IEEE Access 7, 47230–47244 (2019). https://doi.org/10.1109/ACCESS.2019.2909068

    Article  Google Scholar 

  11. Guo, W., Wang, L., Xing, X., Du, M., Song, D.: Tabor: a highly accurate approach to inspecting and restoring trojan backdoors in AI systems. arXiv preprint arXiv:1908.01763 (2019)

  12. Houben, S., Stallkamp, J., Salmen, J., Schlipsing, M., Igel, C.: Detection of traffic signs in real-world images: the german traffic sign detection benchmark. In: International Joint Conference on Neural Networks, No. 1288 (2013)

    Google Scholar 

  13. Huang, X., et al: A survey of safety and trustworthiness of deep neural networks: verification, testing, adversarial attack and defence, and interpretability. Comput. Sci. Rev. 37, 100270 (2020)

    Google Scholar 

  14. Katz, Guy, Katz, G.: The Marabou Framework for Verification and Analysis of Deep Neural Networks. In: Dillig, Isil, Tasiran, Serdar (eds.) CAV 2019. LNCS, vol. 11561, pp. 443–452. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-25540-4_26

    Chapter  Google Scholar 

  15. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)

    Article  Google Scholar 

  16. Liu, Y., et al.: Trojaning attack on neural networks. In: NDSS (2018)

    Google Scholar 

  17. Steinhardt, J., Koh, P.W.W., Liang, P.S.: Certified defenses for data poisoning attacks. Advances in neural information processing systems 30 (2017)

    Google Scholar 

  18. Suciu, O., Marginean, R., Kaya, Y., Daume III, H., Dumitras, T.: When does machine learning \(\{\)FAIL\(\}\)? generalized transferability for evasion and poisoning attacks. In: USENIX Security (2018)

    Google Scholar 

  19. Usman, M. et al.: NNrepair: constraint-based repair of neural network classifiers. Computer Aided Verification , 3–25 (2021). https://doi.org/10.1007/978-3-030-81685-8_1

  20. Wang, B., et al.: Neural cleanse: identifying and mitigating backdoor attacks in neural networks. In: 2019 IEEE Symposium on Security and Privacy (SP), pp. 707–723. IEEE (2019)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Youcheng Sun .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Sun, Y., Usman, M., Gopinath, D., Păsăreanu, C.S. (2022). VPN: Verification of Poisoning in Neural Networks. In: Isac, O., Ivanov, R., Katz, G., Narodytska, N., Nenzi, L. (eds) Software Verification and Formal Methods for ML-Enabled Autonomous Systems. NSV FoMLAS 2022 2022. Lecture Notes in Computer Science, vol 13466. Springer, Cham. https://doi.org/10.1007/978-3-031-21222-2_1

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-21222-2_1

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-21221-5

  • Online ISBN: 978-3-031-21222-2

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics