Detecting Malicious Websites by Query Templates

  • Satomi Kaneko
  • Akira Yamada
  • Yukiko Sawaya
  • Tran Phuong Thao
  • Ayumu Kubota
  • Kazumasa Omote
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12001)

Abstract

With the development of the Internet, web content is increasing exponentially. Along with this growth, web-based attacks such as drive-by downloads and phishing have increased year on year. URL blacklists are widely used to prevent such attacks, but they are not sufficient because they cannot detect newly generated malicious URLs. In this paper, we propose a method that automatically generates query templates to detect malicious websites. Our method focuses on URL query strings that are similar across groups of malicious websites. We evaluate the proposed method on a large-scale dataset and verify its effectiveness. The results show that the method captures the characteristics of malicious campaigns: it detects 11,292 unique malicious domains that Google Safe Browsing did not detect. Moreover, the method achieved high precision over seven months of experiments.
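
The abstract does not spell out how a query template is built, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that a template is the sorted list of query keys with each value replaced by a coarse character class, and that a template is interesting when several distinct malicious domains share it. The function names, the class labels (`<num>`, `<hex>`, `<alpha>`, `<other>`) and the `min_domains` threshold are all hypothetical.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit, parse_qsl


def value_class(value: str) -> str:
    """Map a query value to a coarse character class (hypothetical labels)."""
    if value == "":
        return "<empty>"
    if value.isdigit():
        return "<num>"
    # Checked before <alpha>, so values such as "abc" count as hex.
    if re.fullmatch(r"[0-9a-fA-F]+", value):
        return "<hex>"
    if value.isalpha():
        return "<alpha>"
    return "<other>"


def query_template(url: str) -> str:
    """Generalize the query string of a URL into a key=<class> template."""
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    return "&".join(f"{key}={value_class(val)}" for key, val in sorted(pairs))


def shared_templates(malicious_urls, min_domains=2):
    """Return templates that appear on at least `min_domains` distinct hosts."""
    domains_by_template = defaultdict(set)
    for url in malicious_urls:
        template = query_template(url)
        if template:  # skip URLs without a query string
            domains_by_template[template].add(urlsplit(url).hostname)
    return {
        template: domains
        for template, domains in domains_by_template.items()
        if len(domains) >= min_domains
    }


if __name__ == "__main__":
    # Example URLs are invented for illustration only.
    known_bad = [
        "http://a.example/p?id=1&tok=ab12",
        "http://b.example/q?id=99&tok=ff00",
        "http://c.example/?page=home",
    ]
    for template, domains in shared_templates(known_bad).items():
        print(template, sorted(domains))
```

Under these assumptions, a previously unseen URL whose query string generalizes to one of the shared templates could be treated as a candidate for further inspection, which is how such a scheme might reach domains that a blacklist has not yet listed.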

Keywords

Web security · Web-based attacks · Phishing · Malicious websites detection

Notes

Acknowledgments

The research results have been achieved by WarpDrive: Web-based Attack Response with Practical and Deployable Research InitiatiVE, the Commissioned Research of National Institute of Information and Communications Technology (NICT), JAPAN.

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Satomi Kaneko (1)
  • Akira Yamada (2)
  • Yukiko Sawaya (2)
  • Tran Phuong Thao (3)
  • Ayumu Kubota (2)
  • Kazumasa Omote (1)

  1. University of Tsukuba, Tsukuba, Japan
  2. KDDI Research, Inc., Fujimino, Japan
  3. The University of Tokyo, Bunkyo, Japan
