Abstract
This paper justifies the use of a new set of features, collected during static analysis of executable files, for detecting malicious code. The study addressed the following problems: developing a classifier for executable files in the absence of a priori data about their functionality; building class models of uninfected files and of malware during the learning process; and developing a malicious code detection procedure that applies neural networks and a composition of decision trees to the feature set derived from static analysis of executable files. The paper reports an experimental evaluation of the efficiency of the resulting detection mechanism for neural networks (accuracy 0.99125) and for the decision tree composition (accuracy 0.99240). The results confirm the hypothesis that a heuristic malware analyzer can be built on features selected during static analysis of executable files.
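As a rough illustration of the general idea of a decision tree composition evaluated by accuracy, the following minimal sketch trains depth-1 trees (stumps) and combines them by majority vote. It is not the authors' method: the abstract does not specify the features, the tree-building algorithm, or the data set, so the three numeric features, the synthetic samples, and all function names here are assumptions made purely for illustration.

```python
import random


def make_synthetic_samples(n, seed=0):
    """Hypothetical stand-in for static features; NOT data from the paper."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        label = rng.randint(0, 1)  # 1 = malware, 0 = uninfected (assumed encoding)
        # Three made-up numeric features whose mean shifts with the label.
        features = [rng.gauss(2.0 * label, 1.0) for _ in range(3)]
        samples.append((features, label))
    return samples


def fit_stump(samples, feature_index):
    """Pick the threshold on one feature that maximises training accuracy.

    Assumes larger feature values indicate malware, which holds for the
    synthetic data above by construction.
    """
    best_threshold, best_correct = 0.0, -1
    for features, _ in samples:
        threshold = features[feature_index]
        correct = sum(
            (f[feature_index] > threshold) == bool(y) for f, y in samples
        )
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return feature_index, best_threshold


def predict(stumps, features):
    """Majority vote over the stumps: 1 if more than half vote 'malware'."""
    votes = sum(features[i] > t for i, t in stumps)
    return int(2 * votes > len(stumps))


def accuracy(stumps, samples):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(predict(stumps, f) == y for f, y in samples)
    return correct / len(samples)


train = make_synthetic_samples(200, seed=1)
test = make_synthetic_samples(200, seed=2)
stumps = [fit_stump(train, i) for i in range(3)]  # one stump per feature
print(f"held-out accuracy: {accuracy(stumps, test):.3f}")
```

The accuracy printed here reflects only the synthetic data and says nothing about the figures reported in the abstract; a realistic evaluation would use features extracted from real executables and a proper tree ensemble.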
Cite this article
Kozachok, A.V., Kozachok, V.I. Construction and evaluation of the new heuristic malware detection mechanism based on executable files static analysis. J Comput Virol Hack Tech 14, 225–231 (2018). https://doi.org/10.1007/s11416-017-0309-3
DOI: https://doi.org/10.1007/s11416-017-0309-3