Studying Machine Learning Techniques for Intrusion Detection Systems

  • Quang-Vinh Dang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11814)


Intrusion detection systems (IDSs) have long been studied in the computer security community. Recent advances in machine learning have significantly improved the performance of intrusion detection systems. However, most modern machine learning and deep learning algorithms require large amounts of labeled data, which takes considerable time and effort to collect. Furthermore, by the time enough data has been collected to train the model, it may be too late.

In this study, we first perform a comprehensive survey of existing studies on using machine learning for IDSs. We then present two approaches to detecting network attacks. First, we show that tree-based ensemble learning combined with feature engineering can outperform state-of-the-art results in the field. Second, we present a new approach to selecting training data for IDSs: by using a small subset of the training data combined with some weak classification algorithms, we can improve the performance of the detector while keeping the running cost low.


Keywords: Intrusion detection system; Machine learning; Classification



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Data Innovation Lab, Industrial University of Ho Chi Minh City, Ho Chi Minh City, Vietnam
