Abstract
Backdoor attacks insert hidden associations or triggers into deep neural network (DNN) models to override correct inference such as classification. A backdoored model behaves normally in the absence of the trigger but acts maliciously, as chosen by the attacker, whenever the trigger is present. Though relatively new, these attacks are rapidly evolving into a realistic threat and could have severe consequences, especially since backdoors can be inserted into a wide variety of real-world applications. This paper first provides a brief overview of backdoor attacks and then presents a countermeasure, STRong Intentional Perturbation (STRIP). STRIP intentionally perturbs the incoming input, for instance by superimposing various image patterns, and observes the randomness of the predicted classes for the perturbed inputs from a given deployed model, whether malicious or benign. STRIP fundamentally relies on the entropy of the predicted classes: low entropy violates the input-dependence property of a benign model and thus implies the presence of a malicious, trigger-carrying input. We demonstrate the effectiveness of our method through experiments on two public datasets, MNIST and CIFAR10.
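To make the detection test concrete, below is a minimal sketch of the STRIP entropy check, assuming a Keras-style classifier with a softmax output and a small held-out set of benign images used as overlays. The function name strip_entropy, the additive blending with clipping, and the parameter defaults are illustrative assumptions rather than the authors' exact implementation (the original code is at https://github.com/garrisongys/STRIP). For each incoming input \(x\), \(N\) perturbed copies are formed and the normalized Shannon entropy \(\mathbb{H} = -\frac{1}{N}\sum_{n=1}^{N}\sum_{i=1}^{M} y_i^n \log_2 y_i^n\) is computed over the \(M\) class probabilities \(y^n\) of each perturbed copy; a value below a threshold calibrated on benign inputs flags \(x\) as trojaned.

```python
# Minimal sketch of the STRIP entropy test (illustrative assumptions,
# not the authors' exact implementation). Assumes `model.predict`
# returns softmax probabilities and images are float arrays in [0, 1].
import numpy as np

def strip_entropy(model, x, overlay_set, n_perturb=100, eps=1e-12):
    """Mean Shannon entropy of predictions over perturbed copies of x."""
    # Superimpose n_perturb randomly drawn benign images onto x.
    idx = np.random.choice(len(overlay_set), size=n_perturb, replace=False)
    perturbed = np.clip(x[None, ...] + overlay_set[idx], 0.0, 1.0)
    probs = model.predict(perturbed)              # shape: (n_perturb, M)
    # Entropy of each softmax vector, averaged over the perturbed copies.
    entropy = -np.sum(probs * np.log2(probs + eps), axis=1)
    return entropy.mean()

# Usage: calibrate a threshold on held-out benign inputs (e.g. a low
# percentile of their entropies), then reject low-entropy inputs:
#   if strip_entropy(model, x, benign_images) < threshold:
#       reject(x)  # likely trigger-carrying input
```

The intuition is that a benign input yields high entropy, since each overlay can change the prediction, whereas a trigger-carrying input keeps hijacking the prediction to the attacker's target class regardless of the overlay, driving the entropy toward zero.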
References
Abuadbba, S., et al.: Can we use split learning on 1D CNN models for privacy preserving training? In: The 15th ACM ASIA Conference on Computer and Communications Security (AsiaCCS) (2020)
Bagdasaryan, E., Shmatikov, V.: Blind backdoors in deep learning models. arXiv preprint arXiv:2005.03823 (2020)
Bagdasaryan, E., Veit, A., Hua, Y., Estrin, D., Shmatikov, V.: How to backdoor federated learning. In: International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 2938–2948 (2020). https://github.com/ebagdasa/backdoor_federated_learning
Bhagoji, A.N., Chakraborty, S., Mittal, P., Calo, S.: Analyzing federated learning through an adversarial lens. In: International Conference on Machine Learning (ICML), pp. 634–643 (2019)
Bonawitz, K., et al.: Towards federated learning at scale: system design. arXiv preprint arXiv:1902.01046 (2019)
Chen, H., Fu, C., Zhao, J., Koushanfar, F.: DeepInspect: a black-box Trojan detection and mitigation framework for deep neural networks. In: International Joint Conference on Artificial Intelligence, pp. 4658–4664 (2019)
Chen, X., Liu, C., Li, B., Lu, K., Song, D.: Targeted backdoor attacks on deep learning systems using data poisoning. arXiv preprint arXiv:1712.05526 (2017)
Codreanu, V., Podareanu, D., Saletore, V.: Scale out for large minibatch SGD: residual network training on Imagenet-1k with improved accuracy and reduced time to train. arXiv preprint arXiv:1711.04291 (2017)
Costales, R., Mao, C., Norwitz, R., Kim, B., Yang, J.: Live Trojan attacks on deep neural networks. arXiv preprint arXiv:2004.11370 (2020). https://github.com/robbycostales/live-trojans
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009)
Freesound: Freesound dataset. https://annotator.freesound.org/. Accessed 14 July 2020
Gao, Y., et al.: Backdoor attacks and countermeasures on deep learning: a comprehensive review. arXiv preprint arXiv:2007.10760 (2020)
Gao, Y., et al.: End-to-end evaluation of federated learning and split learning for Internet of Things. In: The 39th International Symposium on Reliable Distributed Systems (SRDS) (2020)
Gao, Y., et al.: Design and evaluation of a multi-domain Trojan detection method on deep neural networks. arXiv preprint arXiv:1911.10312 (2019)
Gao, Y., Xu, C., Wang, D., Chen, S., Ranasinghe, D.C., Nepal, S.: STRIP: a defence against Trojan attacks on deep neural networks. In: Proceedings of the Annual Computer Security Applications Conference (ACSAC), pp. 113–125 (2019). https://github.com/garrisongys/STRIP
Gilad-Bachrach, R., Dowlin, N., Laine, K., Lauter, K., Naehrig, M., Wernsing, J.: CryptoNets: applying neural networks to encrypted data with high throughput and accuracy. In: International Conference on Machine Learning, pp. 201–210 (2016)
Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572 (2014)
Gu, T., Dolan-Gavitt, B., Garg, S.: BadNets: identifying vulnerabilities in the machine learning model supply chain. arXiv preprint arXiv:1708.06733 (2017)
Hard, A., et al.: Federated learning for mobile keyboard prediction. arXiv preprint arXiv:1811.03604 (2018)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016)
Jagielski, M., Severi, G., Harger, N.P., Oprea, A.: Subpopulation data poisoning attacks. arXiv preprint arXiv:2006.14026 (2020)
Ji, Y., Liu, Z., Hu, X., Wang, P., Zhang, Y.: Programmable neural network Trojan for pre-trained feature extractor. arXiv preprint arXiv:1901.07766 (2019)
Ji, Y., Zhang, X., Ji, S., Luo, X., Wang, T.: Model-reuse attacks on deep learning systems. In: Proceedings of the ACM SIGSAC Conference on Computer and Communications Security (CCS), pp. 349–363. ACM (2018)
Krizhevsky, A., Hinton, G.: Learning multiple layers of features from tiny images. Technical report. Citeseer (2009)
Kurita, K., Michel, P., Neubig, G.: Weight poisoning attacks on pre-trained models. arXiv preprint arXiv:2004.06660 (2020). https://github.com/neulab/RIPPLe
LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436 (2015)
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
Liu, Y., Lee, W.C., Tao, G., Ma, S., Aafer, Y., Zhang, X.: ABS: scanning neural networks for back-doors by artificial brain stimulation. In: The ACM Conference on Computer and Communications Security (CCS) (2019)
Liu, Y., et al.: Trojaning attack on neural networks. In: Network and Distributed System Security Symposium (NDSS) (2018)
Liu, Y., Xie, Y., Srivastava, A.: Neural Trojans. In: 2017 IEEE International Conference on Computer Design (ICCD), pp. 45–48. IEEE (2017)
Mohassel, P., Zhang, Y.: SecureML: a system for scalable privacy-preserving machine learning. In: 2017 IEEE Symposium on Security and Privacy (SP), pp. 19–38. IEEE (2017)
Mozilla: Common voice dataset. https://voice.mozilla.org/cnh/datasets. Accessed 14 July 2020
Nguyen, T.D., Rieger, P., Miettinen, M., Sadeghi, A.R.: Poisoning attacks on federated learning-based IoT intrusion detection system. In: NDSS Workshop on Decentralized IoT Systems and Security (2020)
Quiring, E., Rieck, K.: Backdooring and poisoning neural networks with image-scaling attacks. arXiv preprint arXiv:2003.08633 (2020). https://scaling-attacks.net/
Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI Blog 1(8), 9 (2019)
Ribeiro, M., Grolinger, K., Capretz, M.A.: MLaaS: machine learning as a service. In: 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA), pp. 896–902. IEEE (2015)
Schuster, R., Schuster, T., Meri, Y., Shmatikov, V.: Humpty dumpty: controlling word meanings via corpus poisoning. In: IEEE Symposium on Security and Privacy (SP) (2020)
Shafahi, A., et al.: Poison frogs! Targeted clean-label poisoning attacks on neural networks. In: Advances in Neural Information Processing Systems (NIPS), pp. 6103–6113 (2018). https://github.com/ashafahi/inceptionv3-transferLearn-poison
Sun, Z., Kairouz, P., Suresh, A.T., McMahan, H.B.: Can you really backdoor federated learning? arXiv preprint arXiv:1911.07963 (2019)
Tan, T.J.L., Shokri, R.: Bypassing backdoor detection algorithms in deep learning. In: IEEE European Symposium on Security and Privacy (EuroS&P) (2020)
Tang, T.A., Mhamdi, L., McLernon, D., Zaidi, S.A.R., Ghogho, M.: Deep learning approach for network intrusion detection in software defined networking. In: International Conference on Wireless Networks and Mobile Communications (WINCOM), pp. 258–263. IEEE (2016)
Veldanda, A.K., et al.: NNoculation: broad spectrum and targeted treatment of backdoored DNNs. arXiv preprint arXiv:2002.08313 (2020). https://github.com/akshajkumarv/NNoculation
Vepakomma, P., Gupta, O., Swedish, T., Raskar, R.: Split learning for health: distributed deep learning without sharing raw patient data. arXiv preprint arXiv:1812.00564 (2018)
Wang, B., et al.: Neural cleanse: identifying and mitigating backdoor attacks in neural networks. In: Proceedings of the IEEE Symposium on Security and Privacy (SP) (2019). https://github.com/bolunwang/backdoor
Wang, Q., et al.: Adversary resistant deep neural networks with an application to malware detection. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), pp. 1145–1153. ACM (2017)
Xiao, Q., Chen, Y., Shen, C., Chen, Y., Li, K.: Seeing is not believing: camouflage attacks on image scaling algorithms. In: USENIX Security Symposium (USENIX Security 19), pp. 443–460 (2019). https://github.com/yfchen1994/scaling_camouflage
Xu, R., Joshi, J.B., Li, C.: CryptoNN: training neural networks over encrypted data. In: 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS), pp. 1199–1209. IEEE (2019)
Yao, Y., Li, H., Zheng, H., Zhao, B.Y.: Latent backdoor attacks on deep neural networks. In: Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security (CCS), pp. 2041–2055 (2019)
Zhou, C., Fu, A., Yu, S., Yang, W., Wang, H., Zhang, Y.: Privacy-preserving federated learning in fog computing. IEEE Internet Things J. 7, 10782–10793 (2020)
Zhu, C., et al.: Transferable clean-label poisoning attacks on deep neural nets. In: International Conference on Learning Representations (ICLR) (2019). https://github.com/zhuchen03/ConvexPolytopePosioning