Abstract
To combat exponentially evolved modern malware, an effective Malware Detection System and precise malware classification is highly essential. In this paper, the Linear Support Vector Classification (LSVC) recommended Hybrid Features based Malware Detection System (HF-MDS) has been proposed. It uses a combination of the static and dynamic features of the Portable Executable (PE) files as hybrid features to identify unknown malware. The application program interface calls invoked by the PE files during their execution along with their correspondent category are collected and considered as dynamic features from the PE file behavioural report produced by the Cuckoo Sandbox. The PE files’ header details such as optional header, disk operating system header, and file header are treated as static features. The LSVC is used as a feature selector to choose prominent static and dynamic features from their respective Original Feature Space. The features recommended by the LSVC are highly discriminative and used as final features for the classification process. Different sets of experiments were conducted using real-world malware samples to verify the combination of static and dynamic features, which encourage the classifier to attain high accuracy. The tenfold cross-validation experimental results demonstrate that the proposed HF-MDS is proficient in precisely detecting malware and benign PE files by attaining detection accuracy of 99.743% with sequential minimal optimization classifier consisting of hybrid features.
Similar content being viewed by others
References
Alam, S., Traore, I., Sogukpinar, I.: Annotated control flow graph for metamorphic malware detection. Comput. J. 58(10), 2608–2621 (2014)
Allix, K., Bissyandé, T.F., Jérome, Q., Klein, J., Le Traon, Y.: Empirical assessment of machine learning-based malware detectors for android. Empir. Softw. Eng. 21(1), 183–211 (2016)
Amin, M.: A survey of financial losses due to malware. In: Proceedings of the Second International Conference on Information and Communication Technology for Competitive Strategies, p. 145. ACM (2016)
Awan, S., Saqib, N.A.: Detection of malicious executables using static and dynamic features of portable executable (pe) file. In: Security, Privacy and Anonymity in Computation, Communication and Storage: SpaCCS 2016 International Workshops, TrustData, TSP, NOPE, DependSys, BigDataSPT, and WCSSC, Zhangjiajie, China, November 16–18, 2016, Proceedings 9, pp. 48–58. Springer, New York (2016)
Bai, J., Wang, J., Zou, G.: A malware detection scheme based on mining format information. Sci. World J. 2014 (2014)
Bayer, U., Moser, A., Kruegel, C., Kirda, E.: Dynamic analysis of malicious code. J. Comput. Virol. 2(1), 67–77 (2006)
Bazrafshan, Z., Hashemi, H., Fard, S.M.H., Hamzeh, A.: A survey on heuristic malware detection techniques. In: Information and Knowledge Technology (IKT), 2013 5th Conference on, pp. 113–120. IEEE (2013)
Belaoued, M., Mazouzi, S.: A real-time pe-malware detection system based on chi-square test and pe-file features. In: IFIP International Conference on Computer Science and its Applications __, pp. 416–425. Springer, New York (2015)
Bounouh, T., Brahimi, Z., Al-Nemrat, A., Benzaid, C.: A scalable malware classification based on integrated static and dynamic features. In: International Conference on Global Security, Safety, and Sustainability, pp. 113–124. Springer, New York (2017)
Calleja, A., Tapiador, J., Caballero, J.: A look into 30 years of malware development from a software metrics perspective. In: International Symposium on Research in Attacks, Intrusions, and Defenses, pp. 325–345. Springer, New York (2016)
Das, S., Liu, Y., Zhang, W., Chandramohan, M.: Semantics-based online malware detection: towards efficient real-time protection against malware. IEEE Trans. Inf. Forensics Secur. 11(2), 289–302 (2016)
Firdausi, I., Erwin, A., Nugroho, A.S., et al.: Analysis of machine learning techniques used in behavior-based malware detection. In: Advances in Computing, Control and Telecommunication Technologies (ACT), 2010 Second International Conference on, pp. 201–203. IEEE (2010)
Gandotra, E., Bansal, D., Sofat, S.: Integrated framework for classification of malwares. In: Proceedings of the 7th International Conference on Security of Information and Networks, p. 417. ACM (2014)
Guarnieri, C., Tanasi, A., Bremer, J., Schloesser, M.: Automated malware analysis-cuckoo sandbox (2012)
Islam, R., Tian, R., Batten, L.M., Versteeg, S.: Classification of malware based on integrated static and dynamic features. J. Netw. Comput. Appl. 36(2), 646–656 (2013)
Kawaguchi, N., Omote, K.: Malware function classification using APIs in initial behavior. In: Information Security (AsiaJCIS), 2015 10th Asia Joint Conference on, pp. 138–144. IEEE (2015)
Kolter, J.Z., Maloof, M.A.: Learning to detect and classify malicious executables in the wild. J. Mach. Learn. Res. 7, 2721–2744 (2006)
Kumar, A., Kuppusamy, K., Aghila, G.: A learning model to detect maliciousness of portable executable using integrated feature set. J. King Saud Univ.-Comput. Inf. Sci. (2017)
Lengyel, T.K., Maresca, S., Payne, B.D., Webster, G.D., Vogl, S., Kiayias, A.: Scalability, fidelity and stealth in the drakvuf dynamic malware analysis system. In: Proceedings of the 30th Annual Computer Security Applications Conference, pp. 386–395. ACM (2014)
Masud, M.M., Khan, L., Thuraisingham, B.: A scalable multi-level feature extraction technique to detect malicious executables. Inf. Syst. Front. 10(1), 33–45 (2008)
Miller, C., Glendowne, D., Cook, H., Thomas, D., Lanclos, C., Pape, P.: Insights gained from constructing a large scale dynamic analysis platform. Digit. Investig. 22, S48–S56 (2017)
Mohaisen, A., Alrawi, O., Mohaisen, M.: Amal: high-fidelity, behavior-based automated malware analysis and classification. Comput. Secur. 52, 251–266 (2015)
Moser, A., Kruegel, C., Kirda, E.: Limits of static analysis for malware detection. In: Computer Security Applications Conference, 2007. ACSAC 2007. Twenty-Third Annual, pp. 421–430. IEEE (2007)
Noor, M., Abbas, H., Shahid, W.B.: Countering cyber threats for industrial applications: an automated approach for malware evasion detection and analysis. J. Netw. Comput. Appl. (2017)
Pekalska, E., Duin, R.P.: Classifiers for dissimilarity-based pattern recognition. In: Pattern Recognition, 2000. Proceedings. 15th International Conference on, vol. 2, pp. 12–16. IEEE (2000)
Pektaş, A., Eriş, M., Acarman, T.: Proposal of n-gram based algorithm for malware classification. In: The Fifth International Conference on Emerging Security Information, Systems and Technologies, pp. 7–13 (2011)
Qiao, Y., Yang, Y., He, J., Tang, C., Liu, Z.: CBM: free, automatic malware analysis framework using API call sequences. In: Knowledge Engineering and Management, pp. 225–236. Springer, New York (2014)
Reddy, D.K.S., Pujari, A.K.: N-gram analysis for computer virus detection. J. Comput. Virol. 2(3), 231–239 (2006)
Rieck, K., Trinius, P., Willems, C., Holz, T.: Automatic analysis of malware behavior using machine learning. J. Comput. Secur. 19(4), 639–668 (2011)
Saleh, M., Li, T., Xu, S.: Multi-context features for detecting malicious programs. J. Comput. Virol. Hacking Tech., pp. 1–13 (2017)
Salehi, Z., Sami, A., Ghiasi, M.: Maar: Robust features to detect malicious activity based on api calls, their arguments and return values. Eng. Appl. Artif. Intell. 59, 93–102 (2017)
Santos, I., Brezo, F., Nieves, J., Penya, Y.K., Sanz, B., Laorden, C., Bringas, P.G.: Idea: Opcode-sequence-based malware detection. In: International Symposium on Engineering Secure Software and Systems, pp. 35–43. Springer, New York (2010)
Santos, I., Brezo, F., Ugarte-Pedrero, X., Bringas, P.G.: Opcode sequences as representation of executables for data-mining-based unknown malware detection. Inf. Sci. 231, 64–82 (2013)
Santos, I., Devesa, J., Brezo, F., Nieves, J., Bringas, P.G.: Opem: a static-dynamic approach for machine-learning-based malware detection. In: International Joint Conference CISIS12-ICEUTE 12-SOCO 12 Special Sessions, pp. 271–280. Springer, New York (2013)
Santos, I., Nieves, J., Bringas, P.G.: Semi-supervised learning for unknown malware detection. In: DCAI, pp. 415–422. Springer, New York (2011)
Schultz, M.G., Eskin, E., Zadok, F., Stolfo, S.J.: Data mining methods for detection of new malicious executables. In: Security and Privacy, 2001. S&P 2001. Proceedings. 2001 IEEE Symposium on, pp. 38–49. IEEE (2001)
Shabtai, A., Moskovitch, R., Elovici, Y., Glezer, C.: Detection of malicious code by applying machine learning classifiers on static features: a state-of-the-art survey. Inf. Secur. Tech. Rep. 14(1), 16–29 (2009)
Shahzad, R.K., Haider, S.I., Lavesson, N.: Detection of spyware by mining executable files. In: Availability, Reliability, and Security, 2010. ARES’10 International Conference on, pp. 295–302. IEEE (2010)
Sharma, A., Sahay, S.K.: Evolution and detection of polymorphic and metamorphic malwares: a survey. arXiv preprint arXiv:1406.7061 (2014)
Shijo, P., Salim, A.: Integrated static and dynamic analysis for malware detection. Procedia Comput. Sci. 46, 804–811 (2015)
Touchette, F.: The evolution of malware. Netw. Secur. 2016(1), 11–14 (2016)
Vinod, P., Laxmi, V., Gaur, M.S.: Scattered feature space for malware analysis. In: Advances in Computing and Communications, pp. 562–571 (2011)
Willems, C., Holz, T., Freiling, F.: Toward automated dynamic malware analysis using cwsandbox. IEEE Security and Privacy 5(2) (2007)
Ye, Y., Chen, L., Wang, D., Li, T., Jiang, Q., Zhao, M.: SBMDS: an interpretable string based malware detection system using svm ensemble with bagging. J. Comput. Virol. 5(4), 283–293 (2009)
Ye, Y., Li, T., Adjeroh, D., Iyengar, S.S.: A survey on malware detection using data mining techniques. ACM Comput. Surv. 50(3), 41 (2017)
Ye, Y., Li, T., Chen, Y., Jiang, Q.: Automatic malware categorization using cluster ensemble. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 95–104. ACM (2010)
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Shiva Darshan, S.L., Jaidhar, C.D. Windows malware detection system based on LSVC recommended hybrid features. J Comput Virol Hack Tech 15, 127–146 (2019). https://doi.org/10.1007/s11416-018-0327-9
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11416-018-0327-9