Abstract
Phishing is the social engineering strategy that cyber criminals use to obtain confidential information from Internet users. Each year it costs Internet users time and money, in addition to lost productivity. The trends and patterns in such attacks change over time, so detection algorithms need to be robust and adaptive. Although many phishing attacks work by luring Internet users to a web site designed to trick them into revealing sensitive information, some recent phishing attacks work by either installing malware on a computer or hijacking a legitimate web site. In this paper, we present effective and comprehensive classifiers for both kinds of attacks, classical and hijack-based. To the best of our knowledge, our work is the first to consider hijack-based phishing attacks. Our techniques are also effective at zero-hour phishing web site detection. We focus on the fundamental characteristics of phishing web sites and decompose the classification task for a phishing web site into a URL classifier, a content-based classifier, and ways of combining the two. Both the URL classifier and the content-based classifier introduce new features and techniques. We present results of these classifiers and combination schemes on datasets extracted from several sources. We show that: (i) our URL classifier is highly accurate, (ii) our content-based classifier achieves good performance considering the difficulty of the problem and the small size of our white list, and (iii) one of our combination methods detects over 99.97% of phishing web sites with a reasonable false positive rate of about 3.5%, and another achieves a false positive rate of just 0.22% with a true positive rate above 83%. Moreover, our content-based classifier does not need any periodic retraining. Our methods are also language independent.
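To make the decomposition concrete, the following is a minimal sketch of how a URL verdict and a content verdict might be combined. The abstract does not specify the features or the combination schemes, so everything here is an assumption for illustration: the toy heuristics in `url_flags` and `content_flags`, the `classify` helper, and the choice of logical OR and AND as the two combination modes are hypothetical and are not the paper's method.

```python
import re
from urllib.parse import urlparse

# Hypothetical toy heuristics; the paper's actual features are not
# described in the abstract.

def url_flags(url: str) -> bool:
    """Flag a URL whose host is a raw IPv4 address or that contains '@'."""
    host = urlparse(url).hostname or ""
    is_raw_ip = re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host) is not None
    return is_raw_ip or "@" in url

def content_flags(page_text: str, host: str, whitelist: set) -> bool:
    """Flag a page that asks for a password on a host outside the white list."""
    asks_for_password = "password" in page_text.lower()
    return asks_for_password and host not in whitelist

def classify(url: str, page_text: str, whitelist: set, mode: str = "or") -> bool:
    """Combine the two verdicts with OR (assumed) or AND (assumed)."""
    host = urlparse(url).hostname or ""
    u = url_flags(url)
    c = content_flags(page_text, host, whitelist)
    return (u or c) if mode == "or" else (u and c)
```

Under these assumptions, an OR-style combination flags a page when either classifier fires, which tends to raise the detection rate at the cost of more false positives, while an AND-style combination requires agreement and tends to do the opposite. This is the same kind of trade-off as the two operating points reported in (iii), although the paper's own schemes may differ.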
Research partially supported by NSF grants CNS-1319212 and DUE 1241772.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Thakur, T., Verma, R. (2014). Catching Classical and Hijack-Based Phishing Attacks. In: Prakash, A., Shyamasundar, R. (eds) Information Systems Security. ICISS 2014. Lecture Notes in Computer Science, vol 8880. Springer, Cham. https://doi.org/10.1007/978-3-319-13841-1_18
DOI: https://doi.org/10.1007/978-3-319-13841-1_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13840-4
Online ISBN: 978-3-319-13841-1
eBook Packages: Computer Science, Computer Science (R0)