Abstract
In this paper we introduce an efficient application for malicious PDF detection: ADEPT. With targeted attacks rising over the recent past, exploring a new detection and mitigation paradigm becomes mandatory. The use of malicious PDF files that exploit vulnerabilities in well-known PDF readers has become a popular vector for targeted attacks, for which few efficient approaches exist. Although simple in theory, parsing followed by analysis of such files is resource-intensive and may even be impossible due to several obfuscation and reader-specific artifacts. Our paper describes a new approach for detecting such malicious payloads that leverages machine learning techniques and an efficient feature selection mechanism for rapidly detecting anomalies. We assess our approach on a large selection of malicious files and report the experimental performance results for the developed prototype.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Notes
- 1.
- 2.
- 3.
- 4.
- 5.
- 6.
- 7.
- 8.
- 9.
- 10.
- 11.
- 12.
- 13.
- 14.
- 15.
References
Adobe: PDF reference sixth edition, adobe portable document format, version 1.7 (2006)
Filiol, E., Blonce, A., Frayssignes, L.: Portable document format (PDF) security analysis and malware threats. J. Comput. Virol. 3(2), 75–86 (2007)
Daniel, M., Honoroff, J., Miller, C.: Engineering heap overflow exploits with JavaScript. In: Proceedings of the 2nd Conference on USENIX Workshop on Offensive Technologies, WOOT’08, pp. 1:1–1:6. USENIX Association, Berkeley (2008)
Rahman, M.A.: Getting owned by malicious PDF - analysis. Global Information Assurance Certification Paper (2010)
Laskov, P., Šrndić, N.: Static detection of malicious JavaScript-bearing PDF documents. In: Proceedings of the 27th Annual Computer Security Applications Conference. ACSAC ’11, pp. 373–382. ACM, New York (2011)
Šrndic, N., Laskov, P.: Detection of malicious pdf files based on hierarchical document structure. In: Proceedings of the 20th Annual Network and Distributed System Security Symposium (2013)
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.: The WEKA data mining software: an update. ACM SIGKDD Explor. Newsl. 11(1), 10–18 (2009)
Witten, I., Frank, E., Hall, M.: Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann, Amsterdam (2011)
Akbani, R., Kwek, S., Japkowicz, N.: Applying support vector machines to imbalanced datasets. In: Boulicaut, J.-F., Esposito, F., Giannotti, F., Pedreschi, D. (eds.) ECML 2004. LNCS (LNAI), vol. 3201, pp. 39–50. Springer, Heidelberg (2004)
Fan, R., Chang, K., Hsieh, C., Wang, X., Lin, C.: Liblinear: a library for large linear classification. J. Mach. Learn. Res. 9, 1871–1874 (2008)
Lowagie, B.: iText in Action: Creating and Manipulating PDF. Dreamtech Press, New Delhi (2006)
Willems, C., Holz, T., Freiling, F.: Toward automated dynamic malware analysis using CWSandbox. IEEE Secur. Priv. 5, 32–39 (2007)
Trinius, P., Willems, C., Holz, T., Rieck, K.: A malware instruction set for behavior-based analysis. In: Proceedings of the Conference Sicherheit Schutz und Zuverlssigkeit SICHERHEIT (TR-2009-07), pp. 1–11 (2011)
Tzermias, Z., Sykiotakis, G., Polychronakis, M., Markatos, E.P.: Combining static and dynamic analysis for the detection of malicious documents. In: Proceedings of the Fourth European Workshop on System Security. EUROSEC ’11, pp. 4:1–4:6. ACM, New York (2011)
Schmitt, F., Gassen, J., Gerhards-Padilla, E.: Pdf scrutinizer: detecting javascript-based attacks in pdf documents. In: 2012 Tenth Annual International Conference on Privacy, Security and Trust (PST), pp. 104–111. IEEE(2012)
Rieck, K., Krueger, T., Dewald, A.: Cujo: Efficient detection and prevention of drive-by-download attacks. In: Proceedings of the 26th Annual Computer Security Applications Conference, pp. 31–39. ACM (2010)
Smutz, C., Stavrou, A.: Malicious PDF detection using metadata and structural features. In: Proceedings of the 28th Annual Computer Security Applications Conference, pp. 239–248. ACM (2012)
François, J., Wang, S., State, R., Engel, T.: BotTrack: tracking botnets using NetFlow and PageRank. In: Domingo-Pascual, J., Manzoni, P., Palazzo, S., Pont, A., Scoglio, C. (eds.) NETWORKING 2011, Part I. LNCS, vol. 6640, pp. 1–14. Springer, Heidelberg (2011)
Wagner, C., Wagener, G., State, R., Engel, T.: Malware analysis with graph kernels and support vector machines. In: 2009 4th International Conference on Malicious and Unwanted Software (MALWARE), pp. 63–68. IEEE (2009)
Abdelnur, H.J., State, R., Festor, O.: Advanced network fingerprinting. In: Lippmann, R., Kirda, E., Trachtenberg, A. (eds.) RAID 2008. LNCS, vol. 5230, pp. 372–389. Springer, Heidelberg (2008)
Kolter, J., Maloof, M.: Learning to detect and classify malicious executables in the wild. J. Mach. Learn. Res. 7, 2721–2744 (2006)
Li, W., Wang, K., Stolfo, S., Herzog, B.: Fileprints: identifying file types by n-gram analysis. In: Proceedings from the Sixth Annual IEEE SMC Information Assurance Workshop. IAW’05, pp. 64–71. IEEE (2005)
Stolfo, S.J., Wang, K., Li, W.J.: Fileprint analysis for malware detection. ACM CCS WORM (2005)
Li, W., Stolfo, S., Stavrou, A., Androulaki, E., Keromytis, A.: A study of malcode-bearing documents. Detection of Intrusions and Malware, and Vulnerability, Assessment, pp. 231–250 (2007)
Bayer, U., Moser, A., Kruegel, C., Kirda, E.: Dynamic analysis of malicious code. J. Comput. Virol. 1, 67–77 (2006)
Acknowledgment
The authors would like to thank Prof. Dr. Pavel Laskov for the support and dataset provided for our experiments. Special thanks also go to the VirusTotal team for giving us access to several datasets.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jerome, Q., Marchal, S., State, R., Engel, T. (2014). Advanced Detection Tool for PDF Threats. In: Garcia-Alfaro, J., Lioudakis, G., Cuppens-Boulahia, N., Foley, S., Fitzgerald, W. (eds) Data Privacy Management and Autonomous Spontaneous Security. DPM SETOP 2013 2013. Lecture Notes in Computer Science(), vol 8247. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54568-9_19
Download citation
DOI: https://doi.org/10.1007/978-3-642-54568-9_19
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-54567-2
Online ISBN: 978-3-642-54568-9
eBook Packages: Computer ScienceComputer Science (R0)