DBD: Deep Learning DGA-Based Botnet Detection

  • R. Vinayakumar
  • K. P. Soman
  • Prabaharan Poornachandran
  • Mamoun Alazab
  • Alireza Jolfaei
Chapter
Part of the Advanced Sciences and Technologies for Security Applications book series (ASTSA)

Abstract

Botnets play an important role in malware distribution and are widely used to spread malicious activity on the Internet. The literature shows that a large subset of botnets use DNS poisoning to spread malicious activity, and that various methods exist to detect them from DNS queries. However, because botnets generate domain names very frequently, resolving those domain names can be very time-consuming, which makes botnet detection extremely difficult. This chapter proposes a novel deep learning framework to detect malicious domains generated by Domain Generation Algorithms (DGAs). The proposed DGA detection method, named Deep Bot Detect (DBD), can evaluate data from large-scale networks without reverse engineering or Non-Existent Domain (NXDomain) inspection. The framework analyzes domain names and categorizes them using statistical features that are extracted implicitly by deep learning architectures. The framework was tested and deployed in our lab environment. The experimental results demonstrate the effectiveness of the proposed framework and show that the method has high accuracy and a low false-positive rate. The proposed framework is a simple architecture with fewer learnable parameters than other character-based, short-text classification models; it is therefore faster to train and less prone to overfitting. The framework provides an early detection mechanism for identifying Domain-Flux botnets propagating in a network, and it helps keep the Internet clean of related malicious activity.
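
To illustrate what working directly on domain-name characters can look like, the following is a minimal sketch of a character-level encoding step of the kind that usually precedes an embedding layer. It is not the chapter's implementation: the vocabulary, the reserved padding and unknown indices, the left-padding choice, the maximum length of 64, and the name `encode_domain` are all assumptions made for illustration.

```python
import string
from typing import List

# Hypothetical character vocabulary (an assumption, not taken from the chapter):
# lowercase letters, digits, hyphen, and dot. Index 0 is reserved for padding.
VOCAB = string.ascii_lowercase + string.digits + "-."
CHAR_TO_INDEX = {ch: i + 1 for i, ch in enumerate(VOCAB)}
UNKNOWN_INDEX = len(VOCAB) + 1  # shared index for any character outside VOCAB
MAX_LEN = 64  # assumed fixed input length


def encode_domain(domain: str, max_len: int = MAX_LEN) -> List[int]:
    """Map a domain name to a fixed-length list of character indices."""
    indices = [CHAR_TO_INDEX.get(ch, UNKNOWN_INDEX) for ch in domain.lower()]
    indices = indices[:max_len]  # truncate names longer than max_len
    # Left-pad with zeros so every encoded name has the same length.
    return [0] * (max_len - len(indices)) + indices


if __name__ == "__main__":
    for name in ("example.com", "xkqzvbnwplrt.net"):
        print(name, encode_domain(name, max_len=20))
```

Integer sequences of this form could then be passed to an embedding layer, which learns a dense vector per character, so that no hand-crafted statistical features are required.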

Keywords

Botnet, Deep learning, Domain name generation, Malware, Cybercrime, Cyber security, Domain-flux, Keras embedding

Notes

Acknowledgements

This work was supported by the Department of Corporate and Information Services, Northern Territory Government of Australia, Paramount Computer Systems, and Lakhshya Cyber Security Labs. We would like to thank NVIDIA India for the GPU hardware support provided to the research grant. We are also grateful to the Computational Engineering and Networking (CEN) department for encouraging the research.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • R. Vinayakumar (1, corresponding author)
  • K. P. Soman (1)
  • Prabaharan Poornachandran (2)
  • Mamoun Alazab (3)
  • Alireza Jolfaei (4)

  1. Center for Computational Engineering and Networking (CEN), Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India
  2. Centre for Cyber Security Systems and Networks, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Amritapuri, India
  3. Charles Darwin University, Casuarina, Australia
  4. Department of Computing, Macquarie University, Sydney, Australia
