Music classification as a new approach for malware detection

  • Mehrdad Farrokhmanesh
  • Ali Hamzeh
Original Paper

Abstract

Each year, a huge number of malicious programs are released, making malware detection a critical task in computer security. Antivirus programs use various methods for detecting malware, such as signature-based and heuristic-based techniques. Polymorphic and metamorphic malware employs obfuscation techniques to bypass these traditional detection methods, and the number of such malware samples has recently increased dramatically. Most previously proposed detection methods are based on high-level features such as opcodes, function calls or the program’s control flow graph (CFG). Because of new obfuscation techniques, extracting high-level features is difficult, error-prone and time-consuming; hence, approaches that work directly on the program’s bytes are quicker and more accurate. In this paper, a novel byte-level method for detecting malware using audio signal processing techniques is presented. In the proposed method, the program’s bytes are converted to a meaningful audio signal, and Music Information Retrieval (MIR) techniques are then employed to build a machine learning music classification model from the audio signals in order to detect new and unseen instances. Experiments evaluate the influence of different strategies for converting bytes to audio signals, as well as the effectiveness of the method.
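
The byte-to-audio step described above can be illustrated with a minimal sketch. The abstract does not specify the conversion strategy, so the mapping here is an assumption: each program byte is treated as one unsigned 8-bit PCM sample. The function names `bytes_to_wav` and `frame_features`, the sample rate and the frame length are hypothetical, and the per-frame energy and zero-crossing rate are simple stand-ins for the MFCC features named in the keywords, not the features used in the paper.

```python
import io
import wave


def bytes_to_wav(data: bytes, sample_rate: int = 8000) -> bytes:
    """Wrap raw program bytes as mono, 8-bit unsigned PCM WAV audio.

    Assumption: one byte maps to one sample; the paper may use another mapping.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(1)            # 1 byte per sample (8-bit WAV is unsigned)
        wav.setframerate(sample_rate)  # hypothetical rate, chosen for illustration
        wav.writeframes(data)
    return buf.getvalue()


def frame_features(data: bytes, frame_len: int = 256) -> list:
    """Return one (energy, zero-crossing rate) pair per non-overlapping frame."""
    samples = [b - 128 for b in data]  # centre unsigned bytes around zero
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0)
        )
        feats.append((energy, crossings / (frame_len - 1)))
    return feats
```

In a full pipeline, the per-frame features would be summarised for each file (for example, by their mean and variance) and passed to a standard classifier trained on labelled benign and malicious samples.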

Keywords

Malware detection, Music Information Retrieval (MIR), Audio signal processing, MFCC, Classification

Copyright information

© Springer-Verlag France SAS, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Computer Science and Engineering, Shiraz University, Shiraz, Iran
