Abstract
JavaScript is a browser scripting language initially created to enhance the interactivity of web sites and to improve their user-friendliness. However, because it offloads work to the user’s browser, it can be used for malicious activities such as crypto mining, drive-by download attacks, or redirections to web sites hosting malicious software. Given the prevalence of such nefarious scripts, the anti-virus industry has increased its focus on their detection. Attackers, in turn, make increasing use of obfuscation techniques to hinder analysis and the creation of corresponding signatures. Yet these malicious samples share syntactic similarities at an abstract level, which makes it possible to bypass obfuscation and detect even unknown malware variants.
In this paper, we present JaSt, a low-overhead solution that combines the extraction of features from the abstract syntax tree with a random forest classifier to detect malicious JavaScript instances. It is based on a frequency analysis of specific patterns that are predictive of either benign or malicious samples. Even though the analysis is entirely static, it yields a high detection accuracy of almost 99.5% and a low false-negative rate of 0.54%.
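To make the idea of a frequency analysis over syntactic patterns concrete, the following is a minimal sketch and not the JaSt implementation. It assumes that patterns are modelled as n-grams of AST node types, and it uses a hand-written, ESTree-style dictionary (`SAMPLE_AST`, hypothetical) in place of the output of a real JavaScript parser. The helper names `node_types` and `ngram_frequencies` are likewise illustrative. The resulting frequency vector is the kind of input a random forest classifier could be trained on; the classifier itself is not shown.

```python
from collections import Counter

# Hypothetical, hand-written ESTree-style AST for: eval(unescape("%61"));
# A real pipeline would obtain this structure from a JavaScript parser.
SAMPLE_AST = {
    "type": "Program",
    "body": [
        {
            "type": "ExpressionStatement",
            "expression": {
                "type": "CallExpression",
                "callee": {"type": "Identifier", "name": "eval"},
                "arguments": [
                    {
                        "type": "CallExpression",
                        "callee": {"type": "Identifier", "name": "unescape"},
                        "arguments": [{"type": "Literal", "value": "%61"}],
                    }
                ],
            },
        }
    ],
}


def node_types(node):
    """Yield the 'type' of every AST node in depth-first pre-order."""
    if isinstance(node, dict):
        if "type" in node:
            yield node["type"]
        for child in node.values():
            yield from node_types(child)
    elif isinstance(node, list):
        for item in node:
            yield from node_types(item)


def ngram_frequencies(types, n=2):
    """Return the relative frequency of each n-gram of node types."""
    grams = [tuple(types[i:i + n]) for i in range(len(types) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in counts.items()}


# Identifier names and literal values are ignored; only syntax is kept,
# which is why renaming variables or encoding strings does not change
# the features.
types = list(node_types(SAMPLE_AST))
features = ngram_frequencies(types, n=2)
```

Because only node types enter the features, two scripts that differ in identifier names or string contents but share the same structure map to the same vector, which illustrates why a purely syntactic view can be robust against some forms of obfuscation.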
Notes
1. Malware don’t need Coffee, https://malware.dontneedcoffee.com.
2. Alexa top sites, http://www.alexa.com/topsites.
Acknowledgments
This work would not have been possible without the help of the German Federal Office for Information Security and Kafeine DNC, which provided us with materials for our experiments. We would also like to thank the anonymous reviewers of this paper for their valuable feedback. This work was partially supported by the German Federal Ministry of Education and Research (BMBF) through funding for the Center for IT-Security, Privacy and Accountability (CISPA) (FKZ: 16KIS0345).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Fass, A., Krawczyk, R.P., Backes, M., Stock, B. (2018). JaSt: Fully Syntactic Detection of Malicious (Obfuscated) JavaScript. In: Giuffrida, C., Bardin, S., Blanc, G. (eds) Detection of Intrusions and Malware, and Vulnerability Assessment. DIMVA 2018. Lecture Notes in Computer Science, vol 10885. Springer, Cham. https://doi.org/10.1007/978-3-319-93411-2_14
DOI: https://doi.org/10.1007/978-3-319-93411-2_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93410-5
Online ISBN: 978-3-319-93411-2
eBook Packages: Computer Science, Computer Science (R0)