Abstract
In malware analysis, it is important to compare unknown files with previously known malicious samples in order to quickly characterize their behavior and generate signatures. Malware writers often use obfuscation techniques, such as packing, junk insertion, and others, to thwart traditional similarity comparison methods. In this paper, we introduce DepSim, a novel technique for finding dependency similarities between malicious binary programs. DepSim uses taint analysis to construct dependency graphs of a program's control flow and data flow, and then conducts similarity analysis using a new graph isomorphism technique. To improve accuracy and resistance to interference, we reduce redundant loops and remove junk actions in the dependency graph pre-processing phase, which also greatly improves the performance of our comparison algorithm. We implemented a prototype of DepSim and evaluated it on malware in the wild. Our prototype system successfully identified semantic similarities between malware samples and revealed their inner similarity in program logic and behavior. The results demonstrate that our technique is accurate.
Supported by the National Natural Science Foundation of China under Grant Nos. 60703076 and 61073179, and by the National High-Tech Research and Development Plan of China under Grant Nos. 2007AA01Z451 and 2009AA01Z435.
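The pipeline outlined in the abstract (build a dependency graph, pre-process it to drop junk and redundant repetition, then compare graphs) can be illustrated with a small sketch. This is not the DepSim algorithm: the abstract does not specify the graph representation or the graph isomorphism technique, so the sketch substitutes a much simpler stand-in, namely Jaccard similarity over labeled dependency edges. All names here (`prune_junk`, `edge_signature`, `similarity`, the `sinks` parameter, and the operation labels) are hypothetical and chosen only for illustration.

```python
from typing import Dict, Set, Tuple

# Hypothetical representation: node id -> ids of nodes that depend on it.
Graph = Dict[str, Set[str]]


def prune_junk(graph: Graph, sinks: Set[str]) -> Graph:
    """Keep only nodes from which some sink node is reachable.

    Assumption: an action is "junk" if nothing observable (a sink such
    as a system call) depends on it, directly or transitively.
    """
    # Build reverse edges so we can walk backwards from the sinks.
    reverse: Dict[str, Set[str]] = {n: set() for n in graph}
    for src, dsts in graph.items():
        for dst in dsts:
            reverse.setdefault(dst, set()).add(src)

    keep: Set[str] = set()
    stack = [s for s in sinks if s in reverse]
    while stack:
        node = stack.pop()
        if node in keep:
            continue
        keep.add(node)
        stack.extend(reverse.get(node, ()))

    return {n: {d for d in graph.get(n, set()) if d in keep} for n in keep}


def edge_signature(graph: Graph, labels: Dict[str, str]) -> Set[Tuple[str, str]]:
    """Describe a graph by its set of (source label, target label) edges.

    Using a set collapses repeated identical dependencies, a crude
    stand-in for reducing redundant loop iterations.
    """
    return {(labels[s], labels[d]) for s, dsts in graph.items() for d in dsts}


def similarity(sig_a: Set[Tuple[str, str]], sig_b: Set[Tuple[str, str]]) -> float:
    """Jaccard similarity of two edge signatures, in the range [0, 1]."""
    if not sig_a and not sig_b:
        return 1.0
    return len(sig_a & sig_b) / len(sig_a | sig_b)
```

For example, a sample whose graph contains an extra chain of operations that never reaches a sink would, after `prune_junk`, yield the same edge signature as a clean variant with the same labeled dependencies. A real graph isomorphism approach would also account for overall graph structure, which this edge-set comparison ignores.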
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yi, Y., Lingyun, Y., Rui, W., Purui, S., Dengguo, F. (2011). DepSim: A Dependency-Based Malware Similarity Comparison System. In: Lai, X., Yung, M., Lin, D. (eds) Information Security and Cryptology. Inscrypt 2010. Lecture Notes in Computer Science, vol 6584. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21518-6_35
DOI: https://doi.org/10.1007/978-3-642-21518-6_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21517-9
Online ISBN: 978-3-642-21518-6
eBook Packages: Computer Science, Computer Science (R0)