Skip to main content

Phishing Detection with Popular Search Engines: Simple and Effective

  • Conference paper
Foundations and Practice of Security (FPS 2011)

Part of the book series: Lecture Notes in Computer Science ((LNSC,volume 6888))

Included in the following conference series:

Abstract

We propose a new phishing detection heuristic based on the search results returned from popular web search engines such as Google, Bing and Yahoo. The full URL of a website a user intends to access is used as the search string, and the number of results returned and ranking of the website are used for classification. Most of the time, legitimate websites get back large number of results and are ranked first, whereas phishing websites get back no result and/or are not ranked at all.

To demonstrate the effectiveness of our approach, we experimented with four well-known classification algorithms – Linear Discriminant Analysis, Naïve Bayesian, K-Nearest Neighbour, and Support Vector Machine – and observed their performance. The K-Nearest Neighbour algorithm performed best, achieving true positive rate of 98% and false positive and false negative rates of 2%. We used new legitimate websites and phishing websites as our dataset to show that our approach works well even on newly launched websites/webpages – such websites are often misclassified in existing blacklisting and whitelisting approaches.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 54.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 69.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Aaron, G., Rasmussen, R.: Global phishing survey: Trends and domain name use in 2h2009 (May 2010), http://www.antiphishing.org/reports/APWG_GlobalPhishingSurvey_2H2009.pdf

  2. Cao, Y., Han, W., Le, Y.: Anti-phishing based on automated individual white-list. In: DIM 2008: Proceedings of the 4th ACM Workshop on Digital Identity Management, pp. 51–60. ACM, New York (2008)

    Google Scholar 

  3. Chou, N., Ledesma, R., Teraguchi, Y., Mitchell, J.C.: Client-Side Defense Against Web-Based Identity Theft. In: NDSS 2004: Proceedings of the Network and Distributed System Security Symposium (2004)

    Google Scholar 

  4. Cristianini, N., Shawe-Taylor, J.: An introduction to support vector machines: and other kernel-based learning methods, 1st edn. Cambridge University Press (March 2000)

    Google Scholar 

  5. Domeniconi, C., Peng, J., Gunopulos, D.: Locally adaptive metric nearest-neighbor classification. IEEE Transactions on Pattern Analysis and Machine Intelligence 24, 1281–1285 (2002)

    Article  Google Scholar 

  6. Domingos, P., Pazzani, M.: On the Optimality of the Simple Bayesian Classifier under Zero-One Loss. Machine Learning 29(2-3), 103–130 (1997)

    Article  MATH  Google Scholar 

  7. Fu, A.Y., Wenyin, L., Deng, X.: Detecting Phishing Web Pages with Visual Similarity Assessment Based on Earth Mover’s Distance (EMD). IEEE Transactions on Dependable and Secure Computing 3(4), 301–311 (2006)

    Article  Google Scholar 

  8. Fukunaga, K.: Introduction to statistical pattern recognition, 2nd edn. Academic Press Professional, Inc., San Diego (1990)

    MATH  Google Scholar 

  9. Hearst, M.A., Dumais, S.T., Osman, E., Platt, J., Scholkopf, B.: Support vector machines. IEEE Intelligent Systems and their Applications 13(4), 18–28 (1998)

    Article  Google Scholar 

  10. Bian, K., Park, J.-M., Hsiao, M.S., Belanger, F., Hiller, J.: Evaluation of Online Resources in Assisting Phishing Detection. In: Ninth Annual International Symposium on Applications and the Internet, SAINT 2009, pp. 30–36. IEEE Computer Society, Bellevue (2009)

    Chapter  Google Scholar 

  11. Kim, H., Huh, J.H.: Detecting DNS-poisoning-based phishing attacks from their network performance characteristic. Electronics Letters 47(11), 656–658 (2011)

    Article  Google Scholar 

  12. Kirda, E., Kruegel, C.: Protecting Users Against Phishing Attacks with AntiPhish. In: COMPSAC 2005: Proceedings of the 29th Annual International Computer Software and Applications Conference, pp. 517–524. IEEE Computer Society, Washington, DC, USA (2005)

    Google Scholar 

  13. Latourrette, M.: Toward an Explanatory Similarity Measure for Nearest-Neighbor Classification. In: Lopez de Mantaras, R., Plaza, E. (eds.) ECML 2000. LNCS (LNAI), vol. 1810, pp. 238–245. Springer, Heidelberg (2000)

    Chapter  Google Scholar 

  14. Liu, W., Deng, X., Huang, G., Fu, A.Y.: An Antiphishing Strategy Based on Visual Similarity Assessment. IEEE Internet Computing 10(2), 58–65 (2006)

    Article  Google Scholar 

  15. Moore, T., Clayton, R.: Examining the impact of website take-down on phishing. In: eCrime 2007: Proceedings of the Anti-Phishing Working Groups 2nd Annual eCrime Researchers Summit, pp. 1–13. ACM, New York (2007)

    Google Scholar 

  16. Pan, Y., Ding, X.: Anomaly Based Web Phishing Page Detection. In: ACSAC 2006: Proceedings of the 22nd Annual Computer Security Applications Conference, pp. 381–392. IEEE Computer Society, Washington, DC, USA (2006)

    Google Scholar 

  17. Rish, I.: An empirical study of the naive Bayes classifier. In: Proceedings of IJCAI-2001 Workshop on Empirical Methods in Artificial Intelligence (2001)

    Google Scholar 

  18. Ronda, T., Saroiu, S., Wolman, A.: Itrustpage: a user-assisted anti-phishing tool. ACM SIGOPS Operating Systems Review 42(4), 261–272 (2008)

    Article  Google Scholar 

  19. Sheng, S., Wardman, B., Warner, G., Cranor, L.F., Hong, J., Zhang, C.: An empirical analysis of phishing blacklists. In: CEAS 2009: Proceedings of the 6th Conference on Email and Anti-Spam (2009)

    Google Scholar 

  20. Vapnik, V.N.: Statistical Learning Theory. Wiley-Interscience (September 1998)

    Google Scholar 

  21. Wu, X., Kumar, V., Quinlan, J.R., Ghosh, J., Yang, Q., Motoda, H., McLachlan, G.J., Ng, A., Liu, B., Yu, P.S., Zhou, Z.-H., Steinbach, M., Hand, D.J., Steinberg, D.: Top 10 algorithms in data mining. Knowledge and Information Systems 14(1), 1–37 (2007)

    Article  Google Scholar 

  22. Xiang, G., Hong, J.I.: A hybrid phish detection approach by identity discovery and keywords retrieval. In: WWW 2009: Proceedings of the 18th international conference on World wide web, pp. 571–580. ACM, New York (2009)

    Google Scholar 

  23. Zhang, Y., Egelman, S., Cranor, L., Hong, J.: Phinding Phish: Evaluating Anti-Phishing Tools. In: NDSS 2007: Proceedings of the 14th Annual Network and Distributed System Security Symposium (2007)

    Google Scholar 

  24. Zhang, Y., Hong, J.I., Cranor, L.F.: Cantina: a content-based approach to detecting phishing web sites. In: WWW 2007: Proceedings of the 16th International Conference on World Wide Web, pp. 639–648. ACM, New York (2007)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Huh, J.H., Kim, H. (2012). Phishing Detection with Popular Search Engines: Simple and Effective. In: Garcia-Alfaro, J., Lafourcade, P. (eds) Foundations and Practice of Security. FPS 2011. Lecture Notes in Computer Science, vol 6888. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27901-0_15

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-27901-0_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-27900-3

  • Online ISBN: 978-3-642-27901-0

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics