
Implementation of Machine Learning and Data Mining to Improve Cybersecurity and Limit Vulnerabilities to Cyber Attacks


Part of the book series: Studies in Computational Intelligence (SCI, volume 855)

Abstract

Of the many challenges that continue to make cyber-attack detection elusive, the lack of training data remains the biggest. Even though organizations and businesses turn to well-known network monitoring tools such as Wireshark, millions of people remain vulnerable because they lack information about the website behaviors and features that can signal an attack. In fact, most attacks succeed not because threat actors resort to complex coding and evasion techniques but because victims lack the basic tools to detect and avoid them. Despite these challenges, machine learning is transforming the understanding of the nature of cyber-attacks. This study applied machine learning techniques to Phishing Website data with the objective of comparing five algorithms and providing insight that the general public can use to avoid phishing pitfalls. The findings suggest that the Neural Network is the best-performing algorithm, and the model identifies the following as the leading features of phishing websites: inclusion of an IP address in the domain name, a longer URL, use of URL shortening services, inclusion of the “@” symbol in the URL, inclusion of the “-” symbol in the URL, use of non-trusted SSL certificates with an expiry duration of less than 6 months, domains registered for less than one year, and a favicon redirecting from other URLs. The Neural Network is based on a multi-layer perceptron and provides a basis for intelligent systems, so that in future phishing detection can be automated and treated as an artificial intelligence task.
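
Several of the URL-level features named in the abstract can be checked directly from the address string. The following is a minimal sketch of such checks, not the chapter's own feature extraction: the function names, the list of shortening services, and the length threshold of 54 characters are assumptions chosen for illustration. The certificate, domain-age, and favicon features need external lookups and are left out.

```python
import ipaddress
from urllib.parse import urlparse

# Assumed, illustrative list of shortening services (not from the chapter).
SHORTENERS = {"bit.ly", "tinyurl.com", "goo.gl", "t.co", "ow.ly"}

# Assumed cut-off; the abstract only says "longer URL".
LONG_URL_THRESHOLD = 54


def host_is_ip(host: str) -> bool:
    """Return True if the host part is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def url_features(url: str) -> dict:
    """Compute five string-level phishing indicators for one URL."""
    host = urlparse(url).hostname or ""
    return {
        "ip_in_domain": host_is_ip(host),
        "long_url": len(url) > LONG_URL_THRESHOLD,
        "shortener": host in SHORTENERS,
        "at_symbol": "@" in url,
        "dash_symbol": "-" in url,
    }


if __name__ == "__main__":
    for example in ("http://192.168.0.1/login", "https://bit.ly/abc"):
        print(example, url_features(example))
```

Each flag is only a hint; in the study the features are combined by a trained classifier rather than judged one at a time.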



Acknowledgements

The challenges of accessing reliable cybersecurity datasets are well documented and common among researchers. We are therefore grateful to Rami Mustafa and Lee McCluskey of the University of Huddersfield and Fadi Thabtah of the Canadian University of Dubai for preparing and sharing the data.

Corresponding author

Correspondence to Mohamed Alloghani.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Alloghani, M., Al-Jumeily, D., Hussain, A., Mustafina, J., Baker, T., Aljaaf, A.J. (2020). Implementation of Machine Learning and Data Mining to Improve Cybersecurity and Limit Vulnerabilities to Cyber Attacks. In: Yang, XS., He, XS. (eds) Nature-Inspired Computation in Data Mining and Machine Learning. Studies in Computational Intelligence, vol 855. Springer, Cham. https://doi.org/10.1007/978-3-030-28553-1_3
