Classification of Malware Families Based on Runtime Behaviour

Geden, Munir; Happa, Jassim

doi:10.1007/978-3-030-01689-0_3

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 11161)

Included in the following conference series:

International Symposium on Cyberspace Safety and Security

Abstract

This paper distinguishes malware families within a specific category (ransomware) via dynamic analysis. We collect samples from four ransomware families and use the Cuckoo sandbox environment to observe their runtime behaviour. The study aims to provide new insight into malware family classification by comparing candidate runtime features and applying different extraction and selection techniques to them. We try several extraction models on call traces, such as bag-of-words, n-gram sequences and wildcard patterns, and we also examine other behavioural features such as file, registry and mutex artefacts. Wildcard patterns on call traces are designed to overcome advanced evasion strategies such as the insertion of junk API calls, which causes n-gram searches to fail. For the models that generate too many features, we adapt feature selection techniques in a class-wise fashion to avoid unfair representation of families in the feature set, which leads to poor detection performance. To our knowledge, no research paper has applied a class-wise approach to multi-class malware family identification. With a 96.05% correct classification ratio for four families, this study outperforms most studies applying similar techniques.
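The extraction models named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define the wildcard patterns precisely, so the sketch assumes a pattern is an ordered list of API calls with at most `max_gap` arbitrary calls allowed between consecutive elements. The class-wise selection step uses a hypothetical stand-in score (presence rate inside a family minus presence rate in the other families) rather than the selection measure used in the paper, and all function names and the example traces are made up for illustration.

```python
from collections import Counter


def bag_of_words(trace):
    """Count how often each API call appears, ignoring order."""
    return Counter(trace)


def ngrams(trace, n):
    """Return the set of contiguous call sequences of length n."""
    return {tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)}


def matches_with_gaps(trace, pattern, max_gap):
    """Assumed wildcard semantics: the pattern's calls occur in order,
    with at most max_gap unrelated calls between consecutive elements."""
    if not pattern:
        return True
    # Indices where the pattern matched so far could end.
    positions = {i for i, call in enumerate(trace) if call == pattern[0]}
    for wanted in pattern[1:]:
        positions = {
            j
            for i in positions
            for j in range(i + 1, min(i + 2 + max_gap, len(trace)))
            if trace[j] == wanted
        }
        if not positions:
            return False
    return True


def classwise_top_k(samples, k):
    """Pick the k best features per family and return their union.

    samples maps a family name to a non-empty list of feature sets,
    one set per sample. The score is a hypothetical stand-in.
    """
    vocab = set().union(*(s for sets in samples.values() for s in sets))
    selected = set()
    for family, own in samples.items():
        others = [s for fam, sets in samples.items() if fam != family for s in sets]

        def score(feature):
            own_rate = sum(feature in s for s in own) / len(own)
            other_rate = sum(feature in s for s in others) / len(others) if others else 0.0
            return own_rate - other_rate

        # Sort by descending score; repr() breaks ties deterministically.
        ranked = sorted(vocab, key=lambda f: (-score(f), repr(f)))
        selected.update(ranked[:k])
    return selected


# Hypothetical traces: the second has junk calls inserted between real ones.
clean = ["CreateFile", "ReadFile", "CryptEncrypt", "WriteFile"]
junk = ["CreateFile", "GetTickCount", "ReadFile", "GetTickCount", "CryptEncrypt", "WriteFile"]
pattern = ("ReadFile", "CryptEncrypt")

found_as_bigram = pattern in ngrams(junk, 2)           # junk call breaks the bigram
found_as_wildcard = matches_with_gaps(junk, pattern, 1)  # one gap is tolerated
```

Selecting the top features per family and then taking the union, instead of ranking all features globally, keeps a family with weaker individual features from being crowded out of the feature set, which is the imbalance the class-wise approach is meant to avoid.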

Acknowledgements

We thank the VirusTotal community for providing private API access for our research, which enabled us to search for and download the ransomware samples.

Cuckoo reports of the samples (1.4 GB): https://goo.gl/e8jbXq

Source code of the framework: https://bitbucket.org/msgeden/familyclassifier

Author information

Authors and Affiliations

Department of Computer Science, University of Oxford, Oxford, UK
Munir Geden & Jassim Happa

Corresponding author

Correspondence to Munir Geden.

Editor information

Editors and Affiliations

University of Salerno, Fisciano, Italy
Arcangelo Castiglione
University Politehnica of Bucharest, Bucharest, Romania
Florin Pop
University of Campania “L. Vanvitelli”, Caserta, Italy
Massimo Ficco
University of Salerno, Fisciano, Italy
Francesco Palmieri

About this paper

Cite this paper

Geden, M., Happa, J. (2018). Classification of Malware Families Based on Runtime Behaviour. In: Castiglione, A., Pop, F., Ficco, M., Palmieri, F. (eds) Cyberspace Safety and Security. CSS 2018. Lecture Notes in Computer Science, vol 11161. Springer, Cham. https://doi.org/10.1007/978-3-030-01689-0_3

DOI: https://doi.org/10.1007/978-3-030-01689-0_3
Published: 23 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01688-3
Online ISBN: 978-3-030-01689-0
eBook Packages: Computer Science, Computer Science (R0)