A novel scalable intrusion detection system based on deep learning

Abstract

This paper addresses the problem of processing large volumes of security-related data for network intrusion detection. It uses Apache Spark as a big data processing tool to handle large-scale network traffic data. We also propose a hybrid scheme that combines the advantages of deep networks and machine learning methods. First, a stacked autoencoder network extracts latent features. These features are then passed to several classification-based intrusion detection methods, such as support vector machine, random forest, decision tree, and naive Bayes, for fast and efficient detection of intrusions in massive network traffic data. The real-time UNB ISCX 2012 dataset is used to validate the proposed method, and performance is evaluated in terms of accuracy, F-measure, sensitivity, precision, and time.
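
The abstract lists the evaluation metrics without defining them. As a minimal sketch, assuming the standard confusion-matrix definitions and a binary labelling (1 for attack, 0 for normal), accuracy, precision, sensitivity, and F-measure could be computed as follows. The names `BinaryMetrics` and `evaluate` and the toy labels are illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BinaryMetrics:
    accuracy: float
    precision: float
    sensitivity: float  # also called recall or true-positive rate
    f_measure: float    # harmonic mean of precision and sensitivity


def evaluate(y_true: Sequence[int], y_pred: Sequence[int]) -> BinaryMetrics:
    """Compute binary metrics, treating label 1 as attack and 0 as normal.

    Assumption: standard confusion-matrix definitions; the paper's exact
    formulas are not given in the abstract.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks detected
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal traffic passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed attacks

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or no actual attacks).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return BinaryMetrics(accuracy, precision, sensitivity, f_measure)


# Toy labels for illustration only (not results from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

With these toy labels there are three true positives, three true negatives, one false alarm, and one missed attack, so all four metrics equal 0.75. On an imbalanced traffic trace the metrics would typically diverge, which is one reason to report sensitivity and precision alongside accuracy.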

Notes

  1. https://archive.ics.uci.edu/ml/datasets/Spambase

  2. http://archive.ics.uci.edu/ml/datasets/kdd+cup+1999+data

  3. http://www.unb.ca/cic/datasets/nsl.html

  4. http://www.takakura.com/kyoto_data/

  5. http://www.csmining.org/index.php/cdmc-2012.html

  6. Apache Spark™ - Lightning-Fast Cluster Computing, 2015, http://spark.apache.org/

  7. http://www.unb.ca/cic/datasets/ids-2017.html

Acknowledgements

The authors would like to thank Mr. Behdad Behmadi for his contribution to the English copy editing of this paper.

Author information

Corresponding author

Correspondence to Mohsen Kahani.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with animals performed by any of the authors.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Mighan, S.N., Kahani, M. A novel scalable intrusion detection system based on deep learning. Int. J. Inf. Secur. (2020). https://doi.org/10.1007/s10207-020-00508-5

Keywords

  • Apache Spark
  • Stacked autoencoder
  • Latent
  • Accuracy
  • ISCX
  • Intrusion detection