Abstract
Even a single intrusion detection dataset involves the analysis of Big Data. Recent work therefore applies Big Data techniques to combine heterogeneous datasets and to cope with the analysis of huge amounts of data. The main objective of this paper is to present a method to train and combine several datasets from semi-structured sources using the MapReduce programming paradigm under MongoDB, with the aim of increasing intrusion detection rates. We focus on the KDD99, DARPA 1998 and DARPA 1999 datasets and on the MapReduce technique in MongoDB. First, we select the most pertinent attributes and eliminate redundancies from these datasets. Then, we merge them vertically into the same collection. Finally, we analyze the merged dataset with a Bayesian network built with the K2 algorithm as implemented in WEKA.
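The three steps named in the abstract (attribute selection, redundancy elimination, vertical merge) can be illustrated with a minimal sketch. It emulates the map and reduce phases in plain Python rather than calling MongoDB, and it is not the authors' implementation. The attribute subset `SELECTED`, the function names, and the toy records are all assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical attribute subset; the chapter selects its own pertinent attributes.
SELECTED = ("protocol_type", "service", "flag", "label")


def map_record(record, source):
    """Map phase: emit (key, value), where the key is the projection of the
    record onto the selected attributes and the value is its source dataset."""
    key = tuple(record.get(name) for name in SELECTED)
    return key, source


def reduce_sources(key, sources):
    """Reduce phase: collapse all records sharing a key into one document,
    which removes duplicates within and across datasets."""
    merged = dict(zip(SELECTED, key))
    merged["sources"] = sorted(set(sources))
    return merged


def merge_datasets(datasets):
    """Vertically merge several datasets (row-wise union) into one list that
    stands in for a single collection."""
    groups = defaultdict(list)
    for source, records in datasets.items():
        for record in records:
            key, value = map_record(record, source)
            groups[key].append(value)
    return [reduce_sources(key, values) for key, values in groups.items()]


# Toy records (invented for this sketch, not taken from the real datasets).
toy = {
    "KDD99": [
        {"protocol_type": "tcp", "service": "http", "flag": "SF",
         "label": "normal", "duration": 0},
        {"protocol_type": "icmp", "service": "ecr_i", "flag": "SF",
         "label": "smurf", "duration": 0},
    ],
    "DARPA1998": [
        {"protocol_type": "tcp", "service": "http", "flag": "SF",
         "label": "normal", "duration": 2},
    ],
}

collection = merge_datasets(toy)
print(len(collection), "documents after merging")
```

Using the projected attribute tuple as the map key makes deduplication a side effect of the grouping step, so no separate pass over the data is needed. In an actual MongoDB deployment the same grouping would be expressed through the database's own map-reduce or aggregation facilities.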
© 2018 Springer International Publishing AG
About this chapter
Cite this chapter
Elayni, M., Jemili, F. (2018). Using MongoDB Databases for Training and Combining Intrusion Detection Datasets. In: Lee, R. (eds) Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing. SNPD 2017. Studies in Computational Intelligence, vol 721. Springer, Cham. https://doi.org/10.1007/978-3-319-62048-0_2
DOI: https://doi.org/10.1007/978-3-319-62048-0_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-62047-3
Online ISBN: 978-3-319-62048-0
eBook Packages: Engineering, Engineering (R0)