Leveraging String Kernels for Malware Detection

  • Jonas Pfoh
  • Christian Schneider
  • Claudia Eckert
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7873)


Signature-based malware detection will always be a step behind, because it cannot detect novel malware. Machine learning-based methods, on the other hand, can detect novel malware, but classification is frequently performed offline or in batches and often incurs time overheads that make it impractical. We propose an approach that bridges this gap. It uses a support vector machine (SVM) to classify system call traces. In contrast to other methods that use system call traces for malware detection, our approach employs a string kernel to better exploit the sequential information inherent in a system call trace. By classifying system call traces in small sections and keeping a moving average over the probability estimates produced by the SVM, our approach can detect malicious behavior online and achieves high accuracy.
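The two ingredients named in the abstract, a string kernel over system call sequences and a moving average over per-section probability estimates, can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' method: the abstract does not say which string kernel is used, so a simple normalized k-spectrum kernel (shared contiguous k-grams) stands in for it; the window size, the threshold, the class name `MovingAverageDetector`, and the system call names are hypothetical; and the SVM itself is omitted, so the probability estimates fed to the detector are placeholder values.

```python
from collections import Counter, deque
from math import sqrt


def kgram_counts(trace, k):
    """Count every contiguous run of k system calls in the trace."""
    return Counter(tuple(trace[i:i + k]) for i in range(len(trace) - k + 1))


def spectrum_kernel(a, b, k=3):
    """Normalized k-spectrum kernel: cosine similarity of k-gram count vectors.

    Illustrative stand-in for a string kernel; not necessarily the kernel
    used in the paper.
    """
    ca, cb = kgram_counts(a, k), kgram_counts(b, k)
    dot = sum(n * cb[g] for g, n in ca.items())
    norm = sqrt(sum(n * n for n in ca.values()) * sum(n * n for n in cb.values()))
    return dot / norm if norm else 0.0


class MovingAverageDetector:
    """Flags a trace once the mean of the last `window` estimates exceeds `threshold`.

    `window` and `threshold` are hypothetical parameters chosen for illustration.
    """

    def __init__(self, window=5, threshold=0.5):
        self.recent = deque(maxlen=window)  # oldest estimate drops out automatically
        self.threshold = threshold

    def update(self, p_malicious):
        """Add one per-section probability estimate; return (average, flagged)."""
        self.recent.append(p_malicious)
        average = sum(self.recent) / len(self.recent)
        return average, average > self.threshold


if __name__ == "__main__":
    # Hypothetical trace sections; a real system would obtain them from a tracer.
    benign = ["open", "read", "read", "close", "open", "read", "close"]
    similar = ["open", "read", "close", "open", "write", "close"]
    odd = ["socket", "connect", "write", "write", "exec", "write", "exec"]
    print(spectrum_kernel(benign, similar))  # some 3-grams are shared
    print(spectrum_kernel(benign, odd))      # no 3-grams are shared

    # Placeholder probability estimates standing in for SVM outputs.
    detector = MovingAverageDetector(window=3, threshold=0.5)
    for p in [0.1, 0.2, 0.9, 0.8]:
        print(detector.update(p))
```

In a full pipeline, the kernel values between trace sections would populate a precomputed kernel matrix for an SVM, and the SVM's probability estimate for each new section would be passed to `update`. Averaging over several sections keeps a single high estimate from triggering a detection on its own.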


Keywords: Security, Machine Learning, Malware Detection, System Calls





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Jonas Pfoh (1)
  • Christian Schneider (1)
  • Claudia Eckert (1)
  1. Computer Science Department, Technische Universität München, Munich, Germany
