A Framework of New Hybrid Features for Intelligent Detection of Zero Hour Phishing Websites

Nagunwa, Thomas; Naqvi, Syed; Fouad, Shereen; Shah, Hanifa

doi:10.1007/978-3-030-20005-3_4

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 951)

Abstract

Existing machine-learning-based approaches for detecting zero-hour phishing websites achieve only moderate accuracy and false alarm rates and rely heavily on a limited range of feature types. Phishers continually study these features and use sophisticated tools to replicate them in phishing websites in order to evade detection. There is therefore a need for continuous discovery of new, robust and more diverse types of prediction features to improve resilience against evasion. This paper proposes a framework for predicting zero-hour phishing websites that introduces new hybrid features with high prediction performance. The prediction performance of the features was investigated using eight machine learning algorithms, of which Random Forest performed best, with an accuracy of 98.45% and a false negative rate of 0.73%. Domain registration information and webpage reputation features were found to be stronger predictors than the other feature types. Among individual features, webpage reputation features ranked highly in terms of feature importance weights. The measured prediction runtime of 7.63 s per webpage suggests that our approach has potential for real-time applications. Our framework can detect phishing websites hosted on either compromised domains or dedicated phishing domains.
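
The two metrics reported in the abstract can be illustrated with a minimal sketch. It assumes the usual convention that phishing is the positive class, which the abstract does not state explicitly. The function names and the confusion-matrix counts are hypothetical, chosen only for illustration, and are not the paper's data.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all webpages that were classified correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def false_negative_rate(tp, fn):
    """Fraction of phishing webpages wrongly labelled as legitimate.

    Assumes phishing is the positive class (an assumption of this sketch).
    """
    positives = tp + fn
    return fn / positives if positives else 0.0


# Hypothetical counts for illustration only (not results from the paper).
tp, tn, fp, fn = 90, 95, 5, 10
print(f"accuracy = {accuracy(tp, tn, fp, fn):.2%}")
print(f"false negative rate = {false_negative_rate(tp, fn):.2%}")
```

A low false negative rate matters here because each false negative is a phishing webpage that reaches the user undetected.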

Notes

1. Fully Qualified Domain Name (FQDN), also known as the hostname of a webpage.
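
As a small illustration of the note above, the hostname of a webpage can be read from its URL with Python's standard library. This is only a sketch: the helper name `fqdn_of` and the example URL are hypothetical, and the paper does not say how it extracts the hostname.

```python
from urllib.parse import urlsplit


def fqdn_of(url):
    """Return the lowercased hostname of a URL, or None if it has none."""
    return urlsplit(url).hostname


# Hypothetical URL for illustration; the port, path and query are ignored.
print(fqdn_of("https://login.example.com:8443/account/verify?id=1"))
```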

Acknowledgement

The research leading to the results presented in the paper was partially funded by the UK Commonwealth Scholarship Commission (CSC).

Author information

Authors and Affiliations

School of Computing and Digital Technology, Birmingham City University, Birmingham, UK
Thomas Nagunwa, Syed Naqvi, Shereen Fouad & Hanifa Shah

Corresponding author

Correspondence to Thomas Nagunwa.

Editor information

Editors and Affiliations

Data Science and Big Data Lab, Pablo de Olavide University, Seville, Spain
Francisco Martínez Álvarez
Data Science and Big Data Lab, Pablo de Olavide University, Seville, Spain
Alicia Troncoso Lora
University of Salamanca, Salamanca, Spain
José António Sáez Muñoz
Department of Industrial Engineering, University of A Coruña, A Coruña, Spain
Héctor Quintián
University of Salamanca, Salamanca, Spain
Emilio Corchado

About this paper

Cite this paper

Nagunwa, T., Naqvi, S., Fouad, S., Shah, H. (2020). A Framework of New Hybrid Features for Intelligent Detection of Zero Hour Phishing Websites. In: Martínez Álvarez, F., Troncoso Lora, A., Sáez Muñoz, J., Quintián, H., Corchado, E. (eds) International Joint Conference: 12th International Conference on Computational Intelligence in Security for Information Systems (CISIS 2019) and 10th International Conference on EUropean Transnational Education (ICEUTE 2019). CISIS ICEUTE 2019 2019. Advances in Intelligent Systems and Computing, vol 951. Springer, Cham. https://doi.org/10.1007/978-3-030-20005-3_4

DOI: https://doi.org/10.1007/978-3-030-20005-3_4
Published: 28 April 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20004-6
Online ISBN: 978-3-030-20005-3
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
