Google Dorks: Analysis, Creation, and New Defenses

Toffalini, Flavio; Abbà, Maurizio; Carra, Damiano; Balzarotti, Davide

doi:10.1007/978-3-319-40667-1_13

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 9721)

Included in the following conference series:

International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment

Abstract

With the advent of Web 2.0, many users started to maintain personal web pages to show information about themselves or their businesses, or to run simple e-commerce applications. This transition has been facilitated by a large number of frameworks and applications that can be easily installed and customized. Unfortunately, attackers have taken advantage of the widespread use of these technologies, for example by crafting special search engine queries to fingerprint an application framework and automatically locate possible targets. This approach, usually called Google Dorking, is at the core of many automated exploitation bots.

In this paper, we tackle this problem in three steps. We first perform a large-scale study of existing dorks to understand their typology and the information attackers use to identify their target applications. We then propose a defense technique to render URL-based dorks ineffective. Finally, we study the effectiveness of building dorks from combinations of generic words only, and we propose a simple but effective way to protect web applications against this type of fingerprinting.
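To make the notion of a dork typology more concrete, the following minimal sketch groups a dork by the search operators it contains. It is only an illustration: the category names and the operator-to-category mapping are assumptions made here, not the taxonomy used in the paper, and `classify_dork` is a hypothetical helper.

```python
import re

# Assumed mapping from common search operators to coarse categories.
# The category names are illustrative and not taken from the paper.
OPERATOR_CATEGORIES = {
    "inurl": "url",
    "allinurl": "url",
    "intitle": "title",
    "allintitle": "title",
    "intext": "text",
    "allintext": "text",
    "filetype": "file",
    "ext": "file",
    "site": "site",
}

# Matches a run of letters immediately followed by a colon, e.g. "inurl:".
OPERATOR_RE = re.compile(r"\b([a-z]+):", re.IGNORECASE)


def classify_dork(dork: str) -> set[str]:
    """Return the categories a dork touches; operator-free dorks map to 'words'."""
    categories = set()
    for operator in OPERATOR_RE.findall(dork):
        category = OPERATOR_CATEGORIES.get(operator.lower())
        if category is not None:
            categories.add(category)
    # A dork without any known operator is treated as a plain word-based dork.
    return categories or {"words"}
```

Under these assumptions, a query that uses `inurl:` would fall into the URL-based group that the proposed defense targets, while a query made only of generic words would fall into the word-based group studied in the last step.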

Notes

1. Not all dorks were classified correctly by the automatic procedure, so we manually inspected the results to ensure a correct classification.
2. Here we assume that search engines do not try to disguise their requests, as is the case for all the popular ones we encountered in our study.
3. For efficiency reasons, we compute the hit rank by visiting a random sample that covers 30% of the first 1000 results.
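The sampling step in note 3 can be sketched as follows. This excerpt does not define the hit rank, so the sketch assumes it is the share of visited results that actually belong to the target application. The function `estimate_hit_rank` and the `is_hit` callback (which would visit a result and check it) are hypothetical names introduced for illustration.

```python
import random
from typing import Callable, Optional, Sequence


def estimate_hit_rank(
    results: Sequence[str],
    is_hit: Callable[[str], bool],
    top_n: int = 1000,
    sample_fraction: float = 0.30,
    seed: Optional[int] = None,
) -> float:
    """Estimate the share of hits among the first top_n results from a random sample.

    Assumption: "hit rank" means hits divided by visited results.
    """
    head = list(results[:top_n])
    if not head:
        return 0.0
    # Visit only a fraction of the results (30% by default, as in note 3).
    sample_size = max(1, round(len(head) * sample_fraction))
    rng = random.Random(seed)
    sample = rng.sample(head, sample_size)  # sampling without replacement
    hits = sum(1 for url in sample if is_hit(url))
    return hits / sample_size
```

With 1000 results and the default fraction, `is_hit` is called 300 times instead of 1000, which is the efficiency gain the note refers to. The estimate carries sampling error, so two runs with different seeds may differ slightly.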

Author information

Authors and Affiliations

University of Verona, Verona, Italy
Flavio Toffalini & Damiano Carra
LastLine, London, UK
Maurizio Abbà
Eurecom, Sophia-Antipolis, France
Davide Balzarotti

Corresponding author

Correspondence to Flavio Toffalini.

Editor information

Editors and Affiliations

IMDEA Software Institute, Pozuelo de Alarcón, Madrid, Spain
Juan Caballero
Mondragon University, Arrasate, Guipúzcoa, Spain
Urko Zurutuza
Universidad de Zaragoza, Zaragoza, Spain
Ricardo J. Rodríguez

About this paper

Cite this paper

Toffalini, F., Abbà, M., Carra, D., Balzarotti, D. (2016). Google Dorks: Analysis, Creation, and New Defenses. In: Caballero, J., Zurutuza, U., Rodríguez, R. (eds) Detection of Intrusions and Malware, and Vulnerability Assessment. DIMVA 2016. Lecture Notes in Computer Science, vol 9721. Springer, Cham. https://doi.org/10.1007/978-3-319-40667-1_13

DOI: https://doi.org/10.1007/978-3-319-40667-1_13
Published: 12 June 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-40666-4
Online ISBN: 978-3-319-40667-1
eBook Packages: Computer Science, Computer Science (R0)