Enhanced Domain Generating Algorithm Detection Based on Deep Neural Networks

Kumar, Amara Dinesh; Thodupunoori, Harish; Vinayakumar, R.; Soman, K. P.; Poornachandran, Prabaharan; Alazab, Mamoun; Venkatraman, Sitalakshmi

doi:10.1007/978-3-030-13057-2_7

Part of the book series: Advanced Sciences and Technologies for Security Applications (ASTSA)

Abstract

In recent years, botnets have adopted domain generation algorithms (DGAs) to evade detection solutions that rely on either reverse engineering or blacklisting of malicious domain names. A DGA generates a large number of pseudo-random domain names, which the bots use to connect to the command and control server. This makes DGAs attractive to botnet operators (botmasters), because their botnets become more effective and more resilient to blacklisting and takedown efforts. Detecting DGA-generated malicious domains in real time is a highly challenging task, and significant research has applied different machine learning algorithms to it. This research considers contemporary state-of-the-art approaches to detecting malicious DGA domains and proposes a deep learning architecture for detecting DGA-generated domain names.
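To make the idea of a DGA concrete, the following is a minimal illustrative sketch in Python. It is not taken from the chapter and does not reproduce any real malware family: the function name `toy_dga`, the seed string, the SHA-256 derivation, and the 12-letter label length are all assumptions chosen for illustration. It shows only the general principle that a bot and its operator can derive the same list of pseudo-random domain names from a shared seed and the current date.

```python
import hashlib
from datetime import date
from typing import List


def toy_dga(seed: str, day: date, count: int = 5, tld: str = ".com") -> List[str]:
    """Derive `count` pseudo-random domain names from a shared seed and a date.

    Hypothetical scheme for illustration only; real DGA families differ.
    """
    domains = []
    for i in range(count):
        # Both sides can rebuild this input, so both obtain the same names.
        material = f"{seed}-{day.isoformat()}-{i}".encode("utf-8")
        digest = hashlib.sha256(material).hexdigest()
        # Map the first 12 hex digits (0-15) onto the letters a-p,
        # giving a purely alphabetic 12-character label.
        label = "".join(chr(ord("a") + int(c, 16)) for c in digest[:12])
        domains.append(label + tld)
    return domains


if __name__ == "__main__":
    for name in toy_dga("example-seed", date(2019, 8, 15)):
        print(name)
```

Because the output changes with the date, a static blacklist built from one day's names would not cover the next day's names, which is the evasion property described above.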

This chapter presents extensive experiments conducted with various deep neural network (DNN) architectures, mainly the convolutional neural network (CNN), recurrent neural network (RNN), long short-term memory (LSTM), gated recurrent unit (GRU), bidirectional long short-term memory (BiLSTM), bidirectional recurrent neural network (BiRNN), and a CNN-LSTM architecture, for binary and multi-class detection. The performance and efficiency of the proposed DGA Malicious Detector are studied through rigorous experimentation and testing on two different datasets. The first dataset is collected from public sources and the second from private sources. We perform a comprehensive measurement study of DGAs by analyzing more than three million domain names. Our experiments show that the DGA Malicious Detector identifies domains generated by DGA families with high accuracy: 99.7% and 97.1% on the two datasets, respectively. A comparative study of the deep learning approaches benchmarks the DGA Malicious Detector favorably.
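The abstract does not describe how domain names are fed to the networks. As a hedged sketch, assuming a character-level input representation (a common choice for this task, not confirmed by the text above), the preprocessing and the accuracy metric could look as follows. The vocabulary, the names `encode_domain` and `accuracy`, the maximum length of 64, and the left-padding choice are all assumptions made for illustration.

```python
import string
from typing import Dict, List, Sequence

# Assumed vocabulary: lowercase letters, digits, hyphen, and dot.
# Index 0 is reserved for padding; unseen characters share one extra index.
VALID_CHARS = string.ascii_lowercase + string.digits + "-."
CHAR_TO_INDEX: Dict[str, int] = {ch: i + 1 for i, ch in enumerate(VALID_CHARS)}
UNKNOWN_INDEX = len(VALID_CHARS) + 1


def encode_domain(domain: str, max_len: int = 64) -> List[int]:
    """Turn a domain name into a fixed-length list of integer character indices."""
    indices = [CHAR_TO_INDEX.get(ch, UNKNOWN_INDEX) for ch in domain.lower()]
    indices = indices[-max_len:]  # keep the rightmost characters if too long
    return [0] * (max_len - len(indices)) + indices  # left-pad with zeros


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)
```

For example, `encode_domain("abc.com", max_len=10)` returns `[0, 0, 0, 1, 2, 3, 38, 3, 15, 13]`, and `accuracy([1, 0, 1, 1], [1, 0, 0, 1])` returns `0.75`. Fixed-length integer sequences of this kind are what an embedding layer followed by CNN or recurrent layers would typically consume.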

Notes

1. https://github.com/baderj/domain_generation_algorithms
2. http://osint.bambenekconsulting.com/feeds/
3. https://data.netlab.360.com/dga/
4. https://support.alexa.com
5. https://umbrella.cisco.com/blog
6. https://github.com/vinayakumarr/DMD2018
7. Keras is an open source neural network library written in Python. It is capable of running on top of TensorFlow, Microsoft Cognitive Toolkit, or Theano. Designed to enable fast experimentation with deep neural networks, it focuses on being user-friendly, modular, and extensible. https://keras.io/
8. TensorFlow is an open-source software library for dataflow programming across a range of tasks. It is a symbolic math library, and is also used for machine learning applications such as neural networks. https://www.tensorflow.org/
9. https://nlp.amrita.edu/DMD2018
10. https://github.com/dineshresearch/Real-Time-Character-Level-Malicious-DomainName-Prediction-Using-Deep-Learning

Acknowledgement

This work was supported by the Department of Corporate and Information Services, Northern Territory Government of Australia, Paramount Computer Systems, and Lakhshya Cyber Security Labs. We would like to thank NVIDIA India for the GPU hardware support provided to the research grant. We are also grateful to the Computational Engineering and Networking (CEN) department for encouraging the research.

Author information

Authors and Affiliations

Department of Electronics and Communication Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India
Amara Dinesh Kumar & Harish Thodupunoori
Centre for Computational Engineering and Networking (CEN), Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India
R. Vinayakumar & K. P. Soman
Centre for Cyber Security Systems and Networks, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Amritapuri, India
Prabaharan Poornachandran
Charles Darwin University, Casuarina, NT, Australia
Mamoun Alazab
Melbourne Polytechnic, Prahran, Australia
Sitalakshmi Venkatraman

Editor information

Editors and Affiliations

Charles Darwin University, Casuarina, NT, Australia
Mamoun Alazab
Singtel Optus, Sydney, NSW, Australia
MingJian Tang

About this chapter

Cite this chapter

Kumar, A.D. et al. (2019). Enhanced Domain Generating Algorithm Detection Based on Deep Neural Networks. In: Alazab, M., Tang, M. (eds) Deep Learning Applications for Cyber Security. Advanced Sciences and Technologies for Security Applications. Springer, Cham. https://doi.org/10.1007/978-3-030-13057-2_7

DOI: https://doi.org/10.1007/978-3-030-13057-2_7
Published: 15 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-13056-5
Online ISBN: 978-3-030-13057-2
eBook Packages: Computer Science, Computer Science (R0)
