Abstract
In recent years, botnets have employed domain generation algorithms (DGAs) to evade detection solutions that rely on either reverse engineering or blacklisting of malicious domain names. A DGA generates a large number of pseudo-random domain names through which bots connect to the command and control server. This makes DGAs attractive to botnet operators (botmasters), because they make botnets more effective and more resilient to blacklisting and takedown efforts. Detecting DGA-generated malicious domains in real time is a highly challenging task, and significant research has applied different machine learning algorithms to it. This research considers contemporary state-of-the-art approaches to detecting malicious DGA domains and proposes a deep learning architecture for detecting DGA-generated domain names.
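To make the idea of a DGA concrete, the following is a minimal toy sketch, not the algorithm of any real malware family and not taken from the chapter. It assumes a bot and its botmaster share a secret seed and derive the same candidate domains from the current date; the function name `toy_dga`, the seed string, and the `.com` suffix are all hypothetical choices for illustration.

```python
import hashlib
from datetime import date
from typing import List


def toy_dga(seed: str, day: date, count: int = 5, tld: str = ".com") -> List[str]:
    """Derive `count` pseudo-random domain names from a shared seed and a date.

    Illustrative only: real DGA families use their own (often very different)
    seeding and encoding schemes.
    """
    domains = []
    for i in range(count):
        # Anyone who knows the seed and the date can recompute the same input.
        material = f"{seed}-{day.isoformat()}-{i}".encode("utf-8")
        digest = hashlib.sha256(material).hexdigest()
        # Map the first 12 hex digits (0..15) onto the letters a..p.
        label = "".join(chr(ord("a") + int(c, 16)) for c in digest[:12])
        domains.append(label + tld)
    return domains


if __name__ == "__main__":
    for name in toy_dga("example-seed", date(2019, 1, 1)):
        print(name)
```

Because the output depends only on the seed and the date, both sides can compute the same list independently, while a static blacklist would have to anticipate every name in advance.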
This chapter presents extensive experiments conducted with various deep neural network (DNN) architectures, mainly Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), Bidirectional Long Short-Term Memory (BiLSTM), Bidirectional Recurrent Neural Network (BiRNN), and CNN-LSTM, for binary and multi-class detection. The performance and efficiency of the proposed DGA Malicious Detector are studied through rigorous experimentation and testing on two different datasets. The first dataset is drawn from public sources and the second from private sources. We perform a comprehensive measurement study of DGAs by analyzing more than three million domain names. Our experiments show that the DGA Malicious Detector effectively identifies domains generated by DGA families, with accuracies of 99.7% and 97.1% on the two datasets, respectively. A comparative study of the deep learning approaches shows that our DGA Malicious Detector benchmarks well.
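Networks of the kinds listed above are commonly fed domain names as fixed-length sequences of character indices. The sketch below shows one plausible way to prepare such input and to compute accuracy; the vocabulary, the maximum length of 64, the left-padding convention, and the names `encode_domain` and `accuracy` are assumptions made for illustration, not details confirmed by the abstract.

```python
import string
from typing import Dict, List

# Assumed vocabulary: lowercase letters, digits, hyphen, dot, underscore.
# Index 0 is reserved for padding; one extra index marks unknown characters.
VOCAB = string.ascii_lowercase + string.digits + "-._"
CHAR_TO_INDEX: Dict[str, int] = {ch: i + 1 for i, ch in enumerate(VOCAB)}
UNKNOWN_INDEX = len(VOCAB) + 1


def encode_domain(domain: str, max_len: int = 64) -> List[int]:
    """Turn a domain name into a left-padded list of character indices."""
    chars = domain.lower()[:max_len]  # truncate overly long names
    indices = [CHAR_TO_INDEX.get(ch, UNKNOWN_INDEX) for ch in chars]
    return [0] * (max_len - len(indices)) + indices


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of predictions that match the labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


if __name__ == "__main__":
    print(encode_domain("ab.com", max_len=8))
    print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```

The resulting integer sequences would typically pass through an embedding layer before the convolutional or recurrent layers; that part is omitted here to keep the sketch free of framework dependencies.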
Notes
- 7. Keras is an open source neural network library written in Python. It is capable of running on top of TensorFlow, Microsoft Cognitive Toolkit, or Theano. Designed to enable fast experimentation with deep neural networks, it focuses on being user-friendly, modular, and extensible. https://keras.io/
- 8. TensorFlow is an open-source software library for dataflow programming across a range of tasks. It is a symbolic math library, and is also used for machine learning applications such as neural networks. https://www.tensorflow.org/
Acknowledgement
This work was supported by the Department of Corporate and Information Services, Northern Territory Government of Australia, Paramount Computer Systems, and Lakhshya Cyber Security Labs. We would like to thank NVIDIA India for the GPU hardware support to the research grant. We are also grateful to the Computational Engineering and Networking (CEN) department for encouraging the research.
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Kumar, A.D. et al. (2019). Enhanced Domain Generating Algorithm Detection Based on Deep Neural Networks. In: Alazab, M., Tang, M. (eds) Deep Learning Applications for Cyber Security. Advanced Sciences and Technologies for Security Applications. Springer, Cham. https://doi.org/10.1007/978-3-030-13057-2_7
DOI: https://doi.org/10.1007/978-3-030-13057-2_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-13056-5
Online ISBN: 978-3-030-13057-2
eBook Packages: Computer Science, Computer Science (R0)