Abstract
Backdoor attacks insert hidden associations or triggers into deep neural network (DNN) models to override correct inference such as classification. A backdoored model behaves normally in the absence of the trigger but acts maliciously, as chosen by the attacker, whenever the trigger is present. Though relatively new, these attacks are rapidly evolving into a realistic threat and could have severe consequences, especially since backdoors can be inserted into a wide variety of real-world applications. This paper first provides a brief overview of backdoor attacks and then presents a countermeasure, STRong Intentional Perturbation (STRIP). STRIP intentionally perturbs the incoming input, for instance by superimposing various image patterns, and observes the randomness of the predicted classes for the perturbed inputs from a given deployed model, whether malicious or benign. STRIP fundamentally relies on the entropy of the predicted classes: low entropy violates the input-dependence property of a benign model and thus implies the presence of a malicious, trigger-carrying input. We demonstrate the effectiveness of our method through experiments on two public datasets, MNIST and CIFAR10.
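To make the detection test concrete, below is a minimal sketch of the STRIP entropy check, assuming a Keras-style classifier with a softmax output and a small held-out set of benign images used as overlays. The function name strip_entropy, the additive blending with clipping, and the parameter defaults are illustrative assumptions rather than the authors' exact implementation (the original code is at https://github.com/garrisongys/STRIP). For each incoming input \(x\), \(N\) perturbed copies are formed and the normalized Shannon entropy \(\mathbb{H} = -\frac{1}{N}\sum_{n=1}^{N}\sum_{i=1}^{M} y_i^n \log_2 y_i^n\) is computed over the \(M\) class probabilities \(y^n\) of each perturbed copy; a value below a threshold calibrated on benign inputs flags \(x\) as trojaned.

```python
# Minimal sketch of the STRIP entropy test (illustrative assumptions,
# not the authors' exact implementation). Assumes `model.predict`
# returns softmax probabilities and images are float arrays in [0, 1].
import numpy as np

def strip_entropy(model, x, overlay_set, n_perturb=100, eps=1e-12):
    """Mean Shannon entropy of predictions over perturbed copies of x."""
    # Superimpose n_perturb randomly drawn benign images onto x.
    idx = np.random.choice(len(overlay_set), size=n_perturb, replace=False)
    perturbed = np.clip(x[None, ...] + overlay_set[idx], 0.0, 1.0)
    probs = model.predict(perturbed)              # shape: (n_perturb, M)
    # Entropy of each softmax vector, averaged over the perturbed copies.
    entropy = -np.sum(probs * np.log2(probs + eps), axis=1)
    return entropy.mean()

# Usage: calibrate a threshold on held-out benign inputs (e.g. a low
# percentile of their entropies), then reject low-entropy inputs:
#   if strip_entropy(model, x, benign_images) < threshold:
#       reject(x)  # likely trigger-carrying input
```

The intuition is that a benign input yields high entropy, since each overlay can change the prediction, whereas a trigger-carrying input keeps hijacking the prediction to the attacker's target class regardless of the overlay, driving the entropy toward zero.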
References
Abuadbba, S., et al.: Can we use split learning on 1D CNN models for privacy preserving training? In: The 15th ACM ASIA Conference on Computer and Communications Security (AsiaCCS) (2020)
Bagdasaryan, E., Shmatikov, V.: Blind backdoors in deep learning models. arXiv preprint arXiv:2005.03823 (2020)
Bagdasaryan, E., Veit, A., Hua, Y., Estrin, D., Shmatikov, V.: How to backdoor federated learning. In: International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 2938–2948 (2020). https://github.com/ebagdasa/backdoor_federated_learning
Bhagoji, A.N., Chakraborty, S., Mittal, P., Calo, S.: Analyzing federated learning through an adversarial lens. In: International Conference on Machine Learning (ICML), pp. 634–643 (2019)
Bonawitz, K., et al.: Towards federated learning at scale: system design. arXiv preprint arXiv:1902.01046 (2019)
Chen, H., Fu, C., Zhao, J., Koushanfar, F.: DeepInspect: a black-box Trojan detection and mitigation framework for deep neural networks. In: International Joint Conference on Artificial Intelligence, pp. 4658–4664 (2019)
Chen, X., Liu, C., Li, B., Lu, K., Song, D.: Targeted backdoor attacks on deep learning systems using data poisoning. arXiv preprint arXiv:1712.05526 (2017)
Codreanu, V., Podareanu, D., Saletore, V.: Scale out for large minibatch SGD: residual network training on Imagenet-1k with improved accuracy and reduced time to train. arXiv preprint arXiv:1711.04291 (2017)
Costales, R., Mao, C., Norwitz, R., Kim, B., Yang, J.: Live Trojan attacks on deep neural networks. arXiv preprint arXiv:2004.11370 (2020). https://github.com/robbycostales/live-trojans
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009)
Freesound: Freesound dataset. https://annotator.freesound.org/. Accessed 14 July 2020
Gao, Y., et al.: Backdoor attacks and countermeasures on deep learning: a comprehensive review. arXiv preprint arXiv:2007.10760 (2020)
Gao, Y., et al.: End-to-end evaluation of federated learning and split learning for Internet of Things. In: The 39th International Symposium on Reliable Distributed Systems (SRDS) (2020)
Gao, Y., et al.: Design and evaluation of a multi-domain Trojan detection method on deep neural networks. arXiv preprint arXiv:1911.10312 (2019)
Gao, Y., Xu, C., Wang, D., Chen, S., Ranasinghe, D.C., Nepal, S.: STRIP: a defence against Trojan attacks on deep neural networks. In: Proceedings of the Annual Computer Security Applications Conference (ACSAC), pp. 113–125 (2019). https://github.com/garrisongys/STRIP
Gilad-Bachrach, R., Dowlin, N., Laine, K., Lauter, K., Naehrig, M., Wernsing, J.: CryptoNets: applying neural networks to encrypted data with high throughput and accuracy. In: International Conference on Machine Learning, pp. 201–210 (2016)
Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572 (2014)
Gu, T., Dolan-Gavitt, B., Garg, S.: BadNets: identifying vulnerabilities in the machine learning model supply chain. arXiv preprint arXiv:1708.06733 (2017)
Hard, A., et al.: Federated learning for mobile keyboard prediction. arXiv preprint arXiv:1811.03604 (2018)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016)
Jagielski, M., Severi, G., Harger, N.P., Oprea, A.: Subpopulation data poisoning attacks. arXiv preprint arXiv:2006.14026 (2020)
Ji, Y., Liu, Z., Hu, X., Wang, P., Zhang, Y.: Programmable neural network Trojan for pre-trained feature extractor. arXiv preprint arXiv:1901.07766 (2019)
Ji, Y., Zhang, X., Ji, S., Luo, X., Wang, T.: Model-reuse attacks on deep learning systems. In: Proceedings of the ACM SIGSAC Conference on Computer and Communications Security (CCS), pp. 349–363. ACM (2018)
Krizhevsky, A., Hinton, G.: Learning multiple layers of features from tiny images. Technical report. Citeseer (2009)
Kurita, K., Michel, P., Neubig, G.: Weight poisoning attacks on pre-trained models. arXiv preprint arXiv:2004.06660 (2020). https://github.com/neulab/RIPPLe
LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436 (2015)
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
Liu, Y., Lee, W.C., Tao, G., Ma, S., Aafer, Y., Zhang, X.: ABS: scanning neural networks for back-doors by artificial brain stimulation. In: The ACM Conference on Computer and Communications Security (CCS) (2019)
Liu, Y., et al.: Trojaning attack on neural networks. In: Network and Distributed System Security Symposium (NDSS) (2018)
Liu, Y., Xie, Y., Srivastava, A.: Neural Trojans. In: 2017 IEEE International Conference on Computer Design (ICCD), pp. 45–48. IEEE (2017)
Mohassel, P., Zhang, Y.: SecureML: a system for scalable privacy-preserving machine learning. In: 2017 IEEE Symposium on Security and Privacy (SP), pp. 19–38. IEEE (2017)
Mozilla: Common voice dataset. https://voice.mozilla.org/cnh/datasets. Accessed 14 July 2020
Nguyen, T.D., Rieger, P., Miettinen, M., Sadeghi, A.R.: Poisoning attacks on federated learning-based IoT intrusion detection system. In: NDSS Workshop on Decentralized IoT Systems and Security (2020)
Quiring, E., Rieck, K.: Backdooring and poisoning neural networks with image-scaling attacks. arXiv preprint arXiv:2003.08633 (2020). https://scaling-attacks.net/
Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI Blog 1(8), 9 (2019)
Ribeiro, M., Grolinger, K., Capretz, M.A.: MLaaS: machine learning as a service. In: 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA), pp. 896–902. IEEE (2015)
Schuster, R., Schuster, T., Meri, Y., Shmatikov, V.: Humpty dumpty: controlling word meanings via corpus poisoning. In: IEEE Symposium on Security and Privacy (SP) (2020)
Shafahi, A., et al.: Poison frogs! Targeted clean-label poisoning attacks on neural networks. In: Advances in Neural Information Processing Systems (NIPS), pp. 6103–6113 (2018). https://github.com/ashafahi/inceptionv3-transferLearn-poison
Sun, Z., Kairouz, P., Suresh, A.T., McMahan, H.B.: Can you really backdoor federated learning? arXiv preprint arXiv:1911.07963 (2019)
Tan, T.J.L., Shokri, R.: Bypassing backdoor detection algorithms in deep learning. In: IEEE European Symposium on Security and Privacy (EuroS&P) (2020)
Tang, T.A., Mhamdi, L., McLernon, D., Zaidi, S.A.R., Ghogho, M.: Deep learning approach for network intrusion detection in software defined networking. In: International Conference on Wireless Networks and Mobile Communications (WINCOM), pp. 258–263. IEEE (2016)
Veldanda, A.K., et al.: NNoculation: broad spectrum and targeted treatment of backdoored DNNs. arXiv preprint arXiv:2002.08313 (2020). https://github.com/akshajkumarv/NNoculation
Vepakomma, P., Gupta, O., Swedish, T., Raskar, R.: Split learning for health: distributed deep learning without sharing raw patient data. arXiv preprint arXiv:1812.00564 (2018)
Wang, B., et al.: Neural cleanse: identifying and mitigating backdoor attacks in neural networks. In: Proceedings of the IEEE Symposium on Security and Privacy (SP) (2019). https://github.com/bolunwang/backdoor
Wang, Q., et al.: Adversary resistant deep neural networks with an application to malware detection. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), pp. 1145–1153. ACM (2017)
Xiao, Q., Chen, Y., Shen, C., Chen, Y., Li, K.: Seeing is not believing: camouflage attacks on image scaling algorithms. In: USENIX Security Symposium (USENIX Security 19), pp. 443–460 (2019). https://github.com/yfchen1994/scaling_camouflage
Xu, R., Joshi, J.B., Li, C.: CryptoNN: training neural networks over encrypted data. In: 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS), pp. 1199–1209. IEEE (2019)
Yao, Y., Li, H., Zheng, H., Zhao, B.Y.: Latent backdoor attacks on deep neural networks. In: Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security (CCS), pp. 2041–2055 (2019)
Zhou, C., Fu, A., Yu, S., Yang, W., Wang, H., Zhang, Y.: Privacy-preserving federated learning in fog computing. IEEE Internet Things J. 7, 10782–10793 (2020)
Zhu, C., et al.: Transferable clean-label poisoning attacks on deep neural nets. In: International Conference on Learning Representations (ICLR) (2019). https://github.com/zhuchen03/ConvexPolytopePosioning