Evasion Attacks Against Statistical Code Obfuscation Detectors

  • Jiawei Su
  • Danilo Vasconcellos Vargas
  • Kouichi Sakurai
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10418)


In the domain of information security, code obfuscation is often employed for malicious purposes. For example, several papers have reported that obfuscated JavaScript frequently carries malicious functionality, such as redirecting to external malicious websites. To capture such obfuscation, a class of detectors based on statistical features of code, mostly n-grams, has been proposed and claimed to achieve high detection accuracy. In this paper, we formalize a common scenario between defenders who maintain statistical obfuscation detectors and adversaries who want to evade detection. Accordingly, we create two kinds of evasion attack methods and evaluate the robustness of statistical detectors under such attacks. Experimental results show that statistical obfuscation detectors can be easily fooled by a sophisticated adversary, even in worst-case scenarios.


Obfuscated JavaScript, Novelty detection, Adversarial machine learning



This research was partially supported by Collaboration Hubs for International Program (CHIRP) of SICORP, Japan Science and Technology Agency (JST). The authors would like to thank the referees and reviewers for their valuable comments and suggestions to improve the quality of the paper.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Jiawei Su (1)
  • Danilo Vasconcellos Vargas (1)
  • Kouichi Sakurai (1)
  1. Kyushu University, Fukuoka, Japan
