Enhanced Domain Generating Algorithm Detection Based on Deep Neural Networks

  • Amara Dinesh Kumar
  • Harish Thodupunoori
  • R. Vinayakumar
  • K. P. Soman
  • Prabaharan Poornachandran
  • Mamoun Alazab
  • Sitalakshmi Venkatraman
Chapter
Part of the Advanced Sciences and Technologies for Security Applications book series (ASTSA)

Abstract

In recent years, botnets have adopted domain generation algorithms (DGAs) to evade detection solutions that rely on either reverse engineering or blacklisting of malicious domain names. A DGA generates a large number of pseudo-random domain names through which the bots connect to the command and control server. This makes DGAs attractive to botnet operators (botmasters), because their botnets become more effective and more resilient to blacklisting and takedown efforts. Detecting DGA-generated malicious domains in real time is a highly challenging task, and significant research has applied different machine learning algorithms to it. This research considers contemporary state-of-the-art DGA detection approaches and proposes a deep learning architecture for detecting DGA-generated domain names.
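To make the idea of a DGA concrete, the following is a minimal sketch of how a bot and its botmaster could derive the same list of pseudo-random domain names from a shared secret and the current date. It is an illustration only and does not reproduce any real malware family or any algorithm studied in this chapter; the function name `toy_dga`, the seed string, the hashing scheme and the `.com` suffix are all assumptions.

```python
import hashlib
from datetime import date
from typing import List


def toy_dga(seed: str, day: date, count: int = 5, tld: str = ".com") -> List[str]:
    """Derive `count` pseudo-random domain names from a shared seed and a date.

    Hypothetical scheme for illustration; real DGA families differ widely.
    """
    domains = []
    for i in range(count):
        # Both sides can recompute this value without communicating.
        material = f"{seed}-{day.isoformat()}-{i}".encode("utf-8")
        digest = hashlib.sha256(material).hexdigest()
        # Map the first 12 hex digits (0-15) to the letters a-p.
        label = "".join(chr(ord("a") + int(c, 16)) for c in digest[:12])
        domains.append(label + tld)
    return domains


if __name__ == "__main__":
    for name in toy_dga("example-seed", date(2019, 1, 1)):
        print(name)
```

Because the list changes with the date, a blacklist built from yesterday's names does not cover today's, which is the evasion property described above.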

This chapter presents extensive experiments with various deep neural network (DNN) architectures, namely Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), Bidirectional Long Short-Term Memory (BiLSTM), Bidirectional Recurrent Neural Network (BiRNN) and CNN-LSTM, for binary and multi-class detection. The performance and efficiency of the proposed DGA Malicious Detector are studied through rigorous experimentation and testing on two different datasets. The first dataset is drawn from public sources and the second from private sources. We perform a comprehensive measurement study of DGAs by analyzing more than three million domain names. Our experiments show that the DGA Malicious Detector identifies domains generated by DGA families with accuracies of 99.7% and 97.1% on the two datasets, respectively. A comparative study of the deep learning approaches benchmarks our DGA Malicious Detector favorably.
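The abstract does not state how domain names are presented to the networks. A common choice for sequence models such as CNNs and LSTMs is character-level integer encoding with padding to a fixed length, and the sketch below assumes that approach. The vocabulary, the index values for padding and unknown characters, and the default `max_len` are hypothetical and are not taken from the chapter.

```python
import string
from typing import List

# Assumed vocabulary: lowercase letters, digits, hyphen and dot.
ALPHABET = string.ascii_lowercase + string.digits + "-."
PAD, UNKNOWN = 0, 1  # reserved indices (assumption)
CHAR_TO_INDEX = {ch: i + 2 for i, ch in enumerate(ALPHABET)}


def encode_domain(domain: str, max_len: int = 20) -> List[int]:
    """Convert a domain name to a fixed-length list of integer character ids.

    Names longer than `max_len` are truncated; shorter names are left-padded
    with PAD so that every encoded name has the same length.
    """
    ids = [CHAR_TO_INDEX.get(ch, UNKNOWN) for ch in domain.lower()[:max_len]]
    return [PAD] * (max_len - len(ids)) + ids


if __name__ == "__main__":
    print(encode_domain("ab.c", max_len=6))  # [0, 0, 2, 3, 39, 4]
```

Sequences of this form could then be passed to an embedding layer followed by any of the architectures listed above, with a single sigmoid output for binary detection or a softmax over DGA families for multi-class detection.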

Keywords

Domain generation algorithm (DGA), Cybersecurity, Malware, Botnet, DNS, Deep learning

Notes

Acknowledgement

This work was supported by the Department of Corporate and Information Services, Northern Territory Government of Australia, Paramount Computer Systems, and Lakhshya Cyber Security Labs. We thank NVIDIA India for the GPU hardware support provided through a research grant. We are also grateful to the Computational Engineering and Networking (CEN) department for encouraging the research.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Amara Dinesh Kumar (1)
  • Harish Thodupunoori (1)
  • R. Vinayakumar (2)
  • K. P. Soman (2)
  • Prabaharan Poornachandran (3)
  • Mamoun Alazab (4)
  • Sitalakshmi Venkatraman (5)

  1. Department of Electronics and Communication Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India
  2. Centre for Computational Engineering and Networking (CEN), Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India
  3. Centre for Cyber Security Systems and Networks, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Amritapuri, India
  4. Charles Darwin University, Casuarina, Australia
  5. Melbourne Polytechnic, Prahran, Australia
