Implementation of Machine Learning and Data Mining to Improve Cybersecurity and Limit Vulnerabilities to Cyber Attacks

Alloghani, Mohamed; Al-Jumeily, Dhiya; Hussain, Abir; Mustafina, Jamila; Baker, Thar; Aljaaf, Ahmed J.

doi:10.1007/978-3-030-28553-1_3

Chapter
First Online: 04 September 2019

Part of the book series: Studies in Computational Intelligence (SCI, volume 855)

Abstract

Of the many challenges that continue to make cyber-attack detection elusive, the lack of training data remains the biggest. Even though organizations and businesses turn to well-known network monitoring tools such as Wireshark, millions of people remain vulnerable because they lack information about the website behaviors and features that can amount to an attack. In fact, most attacks occur not because threat actors resort to complex coding and evasion techniques but because victims lack the basic tools to detect and avoid them. Despite these challenges, machine learning is revolutionizing the understanding of the nature of cyber-attacks. This study applied machine learning techniques to Phishing Website data with the objective of comparing five algorithms and providing insight that the general public can use to avoid phishing pitfalls. The findings suggest that Neural Network is the best-performing algorithm, and the model identifies the following as the leading features of phishing websites: inclusion of an IP address in the domain name, longer URLs, use of URL shortening services, inclusion of the “@” symbol in the URL, inclusion of the “-” symbol in the URL, use of non-trusted SSL certificates with an expiry duration of less than 6 months, domains registered for less than one year, and favicons redirecting from other URLs. The Neural Network is based on a multi-layer perceptron and provides a basis for intelligence, so that in future phishing detection can be automated and rendered an artificial intelligence task.
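Several of the features named in the abstract can be read directly from a URL string. The following is a minimal sketch of how such checks might look, not the chapter's own feature extraction. The function names, the length threshold, and the list of shortening services are assumptions introduced for illustration; the abstract does not specify them. Features that need external lookups (SSL certificate, domain registration age, favicon source) are left out.

```python
import ipaddress
from urllib.parse import urlparse

# Assumed values for illustration only: the abstract gives neither a length
# threshold for "longer URL" nor a list of URL shortening services.
LONG_URL_THRESHOLD = 54
SHORTENER_HOSTS = {"bit.ly", "tinyurl.com", "goo.gl", "t.co"}


def host_is_ip_address(host: str) -> bool:
    """Return True if the host part is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def url_indicators(url: str) -> dict:
    """Flag the URL-level phishing indicators listed in the abstract."""
    host = urlparse(url).hostname or ""
    return {
        "ip_address_in_domain": host_is_ip_address(host),
        "long_url": len(url) > LONG_URL_THRESHOLD,
        "shortening_service": host in SHORTENER_HOSTS,
        "at_symbol": "@" in url,
        "hyphen": "-" in url,
    }


if __name__ == "__main__":
    for example in ("http://192.168.0.1/login",
                    "http://example.com@evil-site.test/"):
        print(example, url_indicators(example))
```

Each flag on its own is weak evidence, since many legitimate URLs contain a hyphen or are long. In the study these indicators serve as inputs to a trained classifier rather than as standalone rules.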

Acknowledgements

The challenges of accessing reliable cyber security datasets are well documented and common among researchers. As such, we are grateful to Rami Mustafa and Lee McCluskey of the University of Huddersfield and Fadi Thabtah of the Canadian University of Dubai for preparing and sharing the data.

Author information

Authors and Affiliations

Liverpool John Moores University, Liverpool, L3 3AF, UK
Mohamed Alloghani, Dhiya Al-Jumeily, Abir Hussain, Thar Baker & Ahmed J. Aljaaf
Abu Dhabi Health Services Company (SEHA), Abu Dhabi, UAE
Mohamed Alloghani
Kazan Federal University, Kazan, Russia
Jamila Mustafina
Centre of Computer, University of Anbar, Ramadi, Iraq
Ahmed J. Aljaaf

Corresponding author

Correspondence to Mohamed Alloghani.

Editor information

Editors and Affiliations

School of Science and Technology, Middlesex University, London, UK
Xin-She Yang
College of Science, Xi’an Polytechnic University, Xi’an, China
Xing-Shi He

About this chapter

Cite this chapter

Alloghani, M., Al-Jumeily, D., Hussain, A., Mustafina, J., Baker, T., Aljaaf, A.J. (2020). Implementation of Machine Learning and Data Mining to Improve Cybersecurity and Limit Vulnerabilities to Cyber Attacks. In: Yang, XS., He, XS. (eds) Nature-Inspired Computation in Data Mining and Machine Learning. Studies in Computational Intelligence, vol 855. Springer, Cham. https://doi.org/10.1007/978-3-030-28553-1_3

DOI: https://doi.org/10.1007/978-3-030-28553-1_3
Published: 04 September 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-28552-4
Online ISBN: 978-3-030-28553-1
eBook Packages: Intelligent Technologies and Robotics (R0)
