Network Intrusion Detection on Apache Spark with Machine Learning Algorithms

Kurt, Elif Merve; Becerikli, Yaşar

doi:10.1007/978-3-319-98204-5_11

Conference paper
First Online: 27 July 2018

Part of the book series: Communications in Computer and Information Science (CCIS, volume 893)

Abstract

The continuous growth of internet-based services makes network traffic data larger and more complex every day. This makes network attacks increasingly difficult to detect, so ensuring network security requires faster and more efficient data processing methods. Many intrusion detection systems have been developed for this purpose, and development work is ongoing.

This study aims to establish a reference for the development of intrusion detection systems by comparing the performance of machine learning algorithms on the same network data. The full KDD Cup’99 data set was run through four machine learning algorithms (Logistic Regression, Support Vector Machine, Naive Bayes and Random Forest) on Apache Spark, a big data technology, and the results were analyzed comparatively.
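
The comparison described in the abstract could be sketched in PySpark roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the preprocessing, the train/test split, the hyperparameters, or the evaluation metric, so the 80/20 split, the binary normal-versus-attack labeling (Spark's `LinearSVC` supports binary classification only), the index-encoded categorical features, the parameter values, and the helper names `feature_columns` and `compare_classifiers` are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# KDD Cup'99 CSV layout: 41 feature columns followed by one label column.
# spark.read.csv without a header names the columns _c0 ... _c41.
NUM_FEATURES = 41
CATEGORICAL_IDX = (1, 2, 3)        # protocol_type, service, flag
LABEL_COL = f"_c{NUM_FEATURES}"
NORMAL_LABEL = "normal."           # assumed: every other label is an attack


def feature_columns() -> Tuple[List[str], List[str]]:
    """Return (numeric, categorical) column names under Spark's default naming."""
    numeric = [f"_c{i}" for i in range(NUM_FEATURES) if i not in CATEGORICAL_IDX]
    categorical = [f"_c{i}" for i in CATEGORICAL_IDX]
    return numeric, categorical


def compare_classifiers(spark, path: str, seed: int = 42) -> Dict[str, float]:
    """Train four classifiers on one split and return test accuracy per model."""
    # Imported here so the helper above works without a Spark installation.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import (
        LinearSVC, LogisticRegression, NaiveBayes, RandomForestClassifier)
    from pyspark.ml.evaluation import MulticlassClassificationEvaluator
    from pyspark.ml.feature import StringIndexer, VectorAssembler
    from pyspark.sql import functions as F

    numeric, categorical = feature_columns()
    df = spark.read.csv(path, inferSchema=True)
    # Binary target: 0.0 for normal traffic, 1.0 for any attack type.
    df = df.withColumn(
        "label", F.when(F.col(LABEL_COL) == NORMAL_LABEL, 0.0).otherwise(1.0))
    train, test = df.randomSplit([0.8, 0.2], seed=seed)  # assumed split

    # Encode the string columns as category indices (a simplification;
    # one-hot encoding is often preferable for the linear models).
    indexers = [StringIndexer(inputCol=c, outputCol=c + "_idx", handleInvalid="keep")
                for c in categorical]
    assembler = VectorAssembler(
        inputCols=numeric + [c + "_idx" for c in categorical],
        outputCol="features")

    models = {
        "LogisticRegression": LogisticRegression(maxIter=20),
        "LinearSVC": LinearSVC(maxIter=20),
        "NaiveBayes": NaiveBayes(),  # needs non-negative features
        # maxBins must be at least the largest category count ("service").
        "RandomForest": RandomForestClassifier(numTrees=20, maxBins=128),
    }
    evaluator = MulticlassClassificationEvaluator(metricName="accuracy")

    results = {}
    for name, clf in models.items():
        pipeline = Pipeline(stages=indexers + [assembler, clf])
        predictions = pipeline.fit(train).transform(test)
        results[name] = evaluator.evaluate(predictions)
    return results
```

Given an existing `SparkSession` and a local copy of the data set (the file name `kddcup.data` here is hypothetical), `compare_classifiers(spark, "kddcup.data")` would return a dictionary mapping each model name to its accuracy on the held-out split. Training every model on the same split keeps the comparison like-for-like, which is the point the abstract emphasizes.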

Author information

Authors and Affiliations

Computer Engineering Department, Kocaeli University, Umuttepe Campus, 41380, Kocaeli, Turkey
Elif Merve Kurt & Yaşar Becerikli
Ctech Information Technology Incorporated Company, Teknopark Istanbul, Istanbul, Turkey
Elif Merve Kurt
Forensic Computing Department, Ankara Group Presidency, Council of Forensic Medicine, Ankara, Turkey
Yaşar Becerikli

Corresponding authors

Correspondence to Elif Merve Kurt or Yaşar Becerikli.

Editor information

Editors and Affiliations

University of the West of England, Bristol, United Kingdom
Elias Pimenidis
Oxford Brookes University, Oxford, United Kingdom
Chrisina Jayne

About this paper

Cite this paper

Kurt, E.M., Becerikli, Y. (2018). Network Intrusion Detection on Apache Spark with Machine Learning Algorithms. In: Pimenidis, E., Jayne, C. (eds) Engineering Applications of Neural Networks. EANN 2018. Communications in Computer and Information Science, vol 893. Springer, Cham. https://doi.org/10.1007/978-3-319-98204-5_11

DOI: https://doi.org/10.1007/978-3-319-98204-5_11
Published: 27 July 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98203-8
Online ISBN: 978-3-319-98204-5
eBook Packages: Computer Science (R0)
