Windows malware detection system based on LSVC recommended hybrid features

  • Original Paper
Journal of Computer Virology and Hacking Techniques

Abstract

Combating rapidly evolving modern malware requires an effective malware detection system and precise malware classification. This paper proposes a Linear Support Vector Classification (LSVC) recommended Hybrid Features based Malware Detection System (HF-MDS). It combines static and dynamic features of Portable Executable (PE) files into hybrid features to identify unknown malware. The dynamic features are the application programming interface (API) calls that a PE file invokes during execution, together with their corresponding categories, collected from the behavioural report produced by the Cuckoo Sandbox. The static features are PE header details: the optional header, the disk operating system (DOS) header, and the file header. The LSVC serves as a feature selector that picks prominent static and dynamic features from their respective Original Feature Spaces. The features recommended by the LSVC are highly discriminative and are used as the final features for classification. Several sets of experiments on real-world malware samples were conducted to verify that combining static and dynamic features helps the classifier attain high accuracy. Tenfold cross-validation results show that the proposed HF-MDS precisely distinguishes malware from benign PE files, attaining a detection accuracy of 99.743% with the sequential minimal optimization classifier trained on hybrid features.
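
The hybrid feature idea described in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation. It reads a handful of DOS, file, and optional header fields directly with the standard library's struct module (the Notes point to pefile, which a real pipeline would more likely use), and it counts API calls and their categories from a report dictionary. The `behavior` / `processes` / `calls` layout with `api` and `category` keys is an assumption about the Cuckoo JSON report; the feature names, the `static:` and `dynamic:` prefixes, and the toy inputs are hypothetical.

```python
import struct
from collections import Counter


def static_features(data: bytes) -> dict:
    """Read a few DOS, file, and optional header fields from raw PE bytes."""
    (e_magic,) = struct.unpack_from("<H", data, 0)       # DOS header magic 'MZ'
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)   # offset of PE signature
    if e_magic != 0x5A4D or data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("not a PE file")
    # The 20-byte COFF file header follows the 4-byte PE signature.
    (machine, n_sections, timestamp, _symtab, _nsyms,
     opt_size, characteristics) = struct.unpack_from("<HHIIIHH", data, e_lfanew + 4)
    # The optional header starts right after the file header.
    opt_magic, _major, _minor, size_of_code = struct.unpack_from(
        "<HBBI", data, e_lfanew + 24)
    return {
        "e_lfanew": e_lfanew,
        "machine": machine,
        "number_of_sections": n_sections,
        "time_date_stamp": timestamp,
        "size_of_optional_header": opt_size,
        "characteristics": characteristics,
        "optional_magic": opt_magic,
        "size_of_code": size_of_code,
    }


def dynamic_features(report: dict) -> dict:
    """Count API calls and their categories (assumed Cuckoo-style layout)."""
    counts = Counter()
    for process in report.get("behavior", {}).get("processes", []):
        for call in process.get("calls", []):
            counts["api:" + call["api"]] += 1
            counts["category:" + call["category"]] += 1
    return dict(counts)


def hybrid_features(data: bytes, report: dict) -> dict:
    """Merge static and dynamic features into one prefixed feature dictionary."""
    merged = {"static:" + k: v for k, v in static_features(data).items()}
    merged.update({"dynamic:" + k: v for k, v in dynamic_features(report).items()})
    return merged


def toy_pe() -> bytes:
    """Build a tiny synthetic header blob (hypothetical values, not a real program)."""
    dos = bytearray(64)
    struct.pack_into("<H", dos, 0, 0x5A4D)     # 'MZ'
    struct.pack_into("<I", dos, 0x3C, 64)      # PE signature directly after DOS header
    file_header = struct.pack("<HHIIIHH", 0x014C, 3, 0, 0, 0, 224, 0x0102)
    optional = struct.pack("<HBBI", 0x010B, 14, 0, 4096)
    return bytes(dos) + b"PE\x00\x00" + file_header + optional


# Hypothetical behavioural report with three recorded calls.
toy_report = {"behavior": {"processes": [{"calls": [
    {"api": "NtCreateFile", "category": "file"},
    {"api": "NtCreateFile", "category": "file"},
    {"api": "RegOpenKeyExW", "category": "registry"},
]}]}}

features = hybrid_features(toy_pe(), toy_report)
for name in sorted(features):
    print(name, features[name])
```

A dictionary like this could then be vectorised and passed to a LinearSVC-based feature selector before classification; that selection step is not shown here.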

Notes

  1. https://code.google.com/p/pefile/.

  2. http://scikit-learn.org/stable/modules/generated/sklearn.svm.LinearSVC.html.

  3. http://www.cs.waikato.ac.nz/ml/weka/.

  4. http://download.cnet.com/.

  5. http://www.onlinedown.net/.

  6. https://virusshare.com/.

  7. http://dasmalwerk.eu/.

  8. https://www.virustotal.com/.

Author information

Corresponding author

Correspondence to S. L. Shiva Darshan.

Cite this article

Shiva Darshan, S.L., Jaidhar, C.D. Windows malware detection system based on LSVC recommended hybrid features. J Comput Virol Hack Tech 15, 127–146 (2019). https://doi.org/10.1007/s11416-018-0327-9

  • DOI: https://doi.org/10.1007/s11416-018-0327-9
