Using MongoDB Databases for Training and Combining Intrusion Detection Datasets

Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD 2017)

Abstract

Analyzing even a single intrusion detection dataset is a Big Data problem, and recent work applies Big Data techniques to combine heterogeneous datasets and cope with the volume of data to be analyzed. The main objective of this paper is to present a method to train and combine several datasets from semi-structured sources with the MapReduce programming paradigm under MongoDB, with the aim of increasing intrusion detection rates. We focus on the KDD99, DARPA 1998 and DARPA 1999 datasets and on the MapReduce technique in MongoDB. First, we select the most pertinent attributes and eliminate redundancies from these datasets. Then, we merge them vertically into the same collection. Finally, to analyze the resulting dataset, we use a Bayesian network built with the K2 algorithm as implemented in WEKA.
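The three steps in the abstract (attribute selection, redundancy elimination, vertical merge) can be illustrated with a small in-memory sketch. This is not the authors' implementation and does not call MongoDB: it mimics the map and reduce phases in plain Python. The attribute list `SELECTED`, the sample records, and the function names are all hypothetical, and reading "vertically" as row-wise stacking into one collection is an assumption.

```python
from collections import defaultdict

# Hypothetical subset of attributes kept after feature selection.
SELECTED = ("protocol_type", "service", "flag", "label")


def map_record(record):
    """Map step: project a record onto the selected attributes, emit (key, 1)."""
    key = tuple(record.get(name) for name in SELECTED)
    return key, 1


def reduce_counts(values):
    """Reduce step: collapse all records sharing a key into a single count."""
    return sum(values)


def merge_datasets(*datasets):
    """Stack datasets row-wise into one collection with duplicate rows removed."""
    grouped = defaultdict(list)
    for dataset in datasets:
        for record in dataset:
            key, value = map_record(record)
            grouped[key].append(value)
    return [
        {**dict(zip(SELECTED, key)), "count": reduce_counts(values)}
        for key, values in grouped.items()
    ]


# Made-up sample records standing in for two of the source datasets.
kdd99 = [
    {"protocol_type": "tcp", "service": "http", "flag": "SF", "duration": 0, "label": "normal"},
    {"protocol_type": "tcp", "service": "http", "flag": "SF", "duration": 2, "label": "normal"},
]
darpa98 = [
    {"protocol_type": "icmp", "service": "ecr_i", "flag": "SF", "duration": 0, "label": "smurf"},
    {"protocol_type": "tcp", "service": "http", "flag": "SF", "duration": 1, "label": "normal"},
]

merged = merge_datasets(kdd99, darpa98)
```

Here the unselected `duration` attribute is dropped in the map step, so the three `tcp`/`http` records become identical and are reduced to one row. The `count` field records how many source rows each merged row replaced.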

Author information

Correspondence to Marwa Elayni or Farah Jemili.

Copyright information

© 2018 Springer International Publishing AG

About this chapter

Cite this chapter

Elayni, M., Jemili, F. (2018). Using MongoDB Databases for Training and Combining Intrusion Detection Datasets. In: Lee, R. (eds) Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing. SNPD 2017. Studies in Computational Intelligence, vol 721. Springer, Cham. https://doi.org/10.1007/978-3-319-62048-0_2

  • DOI: https://doi.org/10.1007/978-3-319-62048-0_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-62047-3

  • Online ISBN: 978-3-319-62048-0

  • eBook Packages: Engineering, Engineering (R0)
