Using Low-Level Dynamic Attributes for Malware Detection Based on Data Mining Methods

  • Dmitry Komashinskiy
  • Igor Kotenko
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7531)

Abstract

Modern methodologies for detecting computer threats traditionally include heuristic approaches to detecting malicious programs (malware) and their side effects. These approaches are usually used to build auxiliary classification and categorization systems that simplify the processing of previously unseen data sets and reveal previously non-obvious structural and behavioral dependencies in malware. Such systems have a number of issues that stem from the specifics of how they are created and operated. One such issue is the search for feature sets whose use increases the accuracy of malware detection. This paper describes and analyzes an approach that addresses this issue. It is based on instantiating a number of classifiers trained in a feature space that represents low-level dynamic characteristics of the applications under analysis.

Keywords

malware detection, data mining, dynamic attributes

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Dmitry Komashinskiy (1)
  • Igor Kotenko (1)
  1. St. Petersburg Institute for Informatics and Automation (SPIIRAS), St. Petersburg, Russia
