Malicious Software Family Classification using Machine Learning Multi-class Classifiers

San, Cho Cho; Thwin, Mie Mie Su; Htun, Naing Linn

doi:10.1007/978-981-13-2622-6_41

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 481)

Abstract

Because targeted malware attacks are growing rapidly, malware analysis and family classification matter to all types of users, including personal, enterprise, and government users. Traditional signature-based malware detection and anti-virus systems fail to classify new variants of unknown malware into their corresponding families. We therefore propose a malware family classification system for 11 malicious families that extracts their prominent API features from the reports of an enhanced and scalable version of the Cuckoo sandbox. The proposed system also contributes a feature extraction algorithm and a feature reduction and representation procedure for identifying and representing the extracted feature attributes. To classify the different types of malicious software, the system uses Random Forest (RF), K-Nearest Neighbor (KNN), and Decision Table (DT) multi-class machine learning classifiers; the RF and KNN classifiers achieve a high accuracy of 95.8% in malware family classification.
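The pipeline summarized in the abstract (API features extracted from sandbox reports, then multi-class classification) can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes each report is a dictionary with the layout `behavior -> processes -> calls -> api`, uses a plain binary "API was called" representation, and swaps in a small hand-written k-nearest-neighbor vote with Hamming distance in place of the paper's RF, KNN, and DT classifiers. The helper names, the family labels `family_a` and `family_b`, and the toy reports are all hypothetical.

```python
from collections import Counter


def extract_api_set(report):
    """Collect the distinct API names called by any process in one report.

    Assumes a report layout of behavior -> processes -> calls -> api;
    the enhanced sandbox used in the paper may use a different layout.
    """
    apis = set()
    for process in report.get("behavior", {}).get("processes", []):
        for call in process.get("calls", []):
            name = call.get("api")
            if name:
                apis.add(name)
    return apis


def build_vocabulary(api_sets):
    """Return a sorted list of every API name seen in the training reports."""
    return sorted(set().union(*api_sets))


def to_vector(api_set, vocabulary):
    """Represent one sample as a binary vector: 1 if the API was called."""
    return [1 if api in api_set else 0 for api in vocabulary]


def knn_predict(train_vectors, train_labels, query, k=3):
    """Majority vote among the k training samples nearest in Hamming distance."""
    distances = sorted(
        (sum(a != b for a, b in zip(vector, query)), label)
        for vector, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def make_report(api_names):
    """Build a toy report in the assumed layout (illustration only)."""
    return {"behavior": {"processes": [{"calls": [{"api": n} for n in api_names]}]}}


# Hypothetical training data: two made-up families with made-up behavior.
training = [
    (make_report(["CreateFileW", "WriteFile", "RegSetValueExW"]), "family_a"),
    (make_report(["CreateFileW", "WriteFile", "DeleteFileW"]), "family_a"),
    (make_report(["InternetOpenA", "HttpSendRequestA", "connect"]), "family_b"),
    (make_report(["InternetOpenA", "connect", "send"]), "family_b"),
]

api_sets = [extract_api_set(report) for report, _ in training]
labels = [label for _, label in training]
vocabulary = build_vocabulary(api_sets)
train_vectors = [to_vector(s, vocabulary) for s in api_sets]

# Classify an unseen sample that only performs file operations.
query_report = make_report(["CreateFileW", "WriteFile"])
query_vector = to_vector(extract_api_set(query_report), vocabulary)
predicted = knn_predict(train_vectors, labels, query_vector, k=3)
print(predicted)
```

A binary presence vector is only one possible representation; call frequencies or call sequences are common alternatives, and the abstract does not say which one the proposed feature representation procedure uses.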

Author information

Authors and Affiliations

Cyber Security Research Lab, University of Computer Studies, Yangon, Myanmar
Cho Cho San, Mie Mie Su Thwin & Naing Linn Htun

Corresponding author

Correspondence to Cho Cho San.

Editor information

Editors and Affiliations

Knowledge Technology Research Unit, Faculty of Computing and Informatics, Universiti Malaysia Sabah, Kota Kinabalu, Malaysia
Rayner Alfred
School of Information Science, Security and Networks Area, Japan Advanced Institute of Science and Technology, Ishikawa, Japan
Yuto Lim
Faculty of Computing and Informatics, Universiti Malaysia Sabah, Kota Kinabalu, Sabah, Malaysia
Ag Asri Ag Ibrahim
Lincoln University, Christchurch, New Zealand
Patricia Anthony

About this paper

Cite this paper

San, C.C., Thwin, M.M.S., Htun, N.L. (2019). Malicious Software Family Classification using Machine Learning Multi-class Classifiers. In: Alfred, R., Lim, Y., Ibrahim, A., Anthony, P. (eds) Computational Science and Technology. Lecture Notes in Electrical Engineering, vol 481. Springer, Singapore. https://doi.org/10.1007/978-981-13-2622-6_41

DOI: https://doi.org/10.1007/978-981-13-2622-6_41
Published: 28 August 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-2621-9
Online ISBN: 978-981-13-2622-6
eBook Packages: Engineering, Engineering (R0)
