Domain Generation Algorithm Detection Using Machine Learning Methods

Baruch, Moran; David, Gil

doi:10.1007/978-3-319-75307-2_9

Part of the book series: Intelligent Systems, Control and Automation: Science and Engineering (ISCA, volume 93)

Abstract

A botnet is a network of private computers infected with malicious software and controlled as a group without their owners' knowledge. Cybercriminals use botnets for malicious activities such as stealing sensitive data, sending spam, and launching Distributed Denial of Service (DDoS) attacks. A Command and Control (C&C) server sends commands to the compromised hosts to execute those activities. To avoid detection, recent botnets such as Conficker, Zeus, and Cryptolocker apply a technique called Domain-Fluxing, or Domain Name Generation Algorithms (DGA), in which the infected bot periodically generates a large number of pseudorandom domain names and tries to resolve them until the DNS server resolves one. In this paper, we survey different machine learning methods for detecting such DGAs by analyzing only the alphanumeric characteristics of the domain names seen in the network. We also propose unsupervised models and evaluate their performance against existing supervised models used in previous research in this field. The proposed unsupervised methods achieve better results than the compared supervised techniques, while also detecting zero-day DGAs.
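
As a rough illustration of the idea described in the abstract, the Python sketch below derives pseudorandom domain names from a shared seed and the current date, then computes a few alphanumeric features of a domain label. Everything in it is an assumption made for illustration: the `toy_dga` generator, the seed string, the `.example` suffix, and the four features (length, character entropy, digit ratio, vowel ratio) are hypothetical and are not taken from the chapter or from any real botnet.

```python
import hashlib
import math
from collections import Counter
from datetime import date


def toy_dga(seed: str, day: date, count: int = 5, length: int = 12) -> list[str]:
    """Hypothetical generator: derive `count` pseudorandom names from a seed and a date."""
    domains = []
    for i in range(count):
        digest = hashlib.sha256(f"{seed}-{day.isoformat()}-{i}".encode()).hexdigest()
        # Map each hex digit (0..15) onto the letters a..p to get a name-like label.
        label = "".join(chr(ord("a") + int(c, 16)) for c in digest[:length])
        domains.append(label + ".example")
    return domains


def shannon_entropy(text: str) -> float:
    """Character-level Shannon entropy in bits; 0.0 for an empty string."""
    total = len(text)
    if total == 0:
        return 0.0
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def lexical_features(domain: str) -> dict[str, float]:
    """Illustrative alphanumeric features of the leftmost label of a domain."""
    label = domain.split(".")[0].lower()
    if not label:
        raise ValueError("domain has an empty leftmost label")
    size = len(label)
    return {
        "length": float(size),
        "entropy": shannon_entropy(label),
        "digit_ratio": sum(ch.isdigit() for ch in label) / size,
        "vowel_ratio": sum(ch in "aeiou" for ch in label) / size,
    }


if __name__ == "__main__":
    for name in ["google.com"] + toy_dga("demo-seed", date(2018, 5, 5), count=3):
        print(name, lexical_features(name))
```

Feature vectors of this kind could be passed to either a supervised classifier or an unsupervised outlier detector; which features and models work best is the subject of the chapter itself, not something this sketch settles.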

Author information

Authors and Affiliations

University of Jyväskylä, Jyväskylä, Finland
Moran Baruch & Gil David

Corresponding author

Correspondence to Moran Baruch.

Editor information

Editors and Affiliations

Faculty of Information Technology, University of Jyväskylä, Jyväskylä, Finland
Martti Lehto
Faculty of Information Technology, University of Jyväskylä, Jyväskylä, Finland
Pekka Neittaanmäki

About this chapter

Cite this chapter

Baruch, M., David, G. (2018). Domain Generation Algorithm Detection Using Machine Learning Methods. In: Lehto, M., Neittaanmäki, P. (eds) Cyber Security: Power and Technology. Intelligent Systems, Control and Automation: Science and Engineering, vol 93. Springer, Cham. https://doi.org/10.1007/978-3-319-75307-2_9

DOI: https://doi.org/10.1007/978-3-319-75307-2_9
Published: 05 May 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-75306-5
Online ISBN: 978-3-319-75307-2
eBook Packages: Engineering, Engineering (R0)