Abstract
There has been considerable research into, and use of, similarity digests and Locality Sensitive Hashing (LSH) schemes - those hashing schemes where small changes in a file result in small changes in the digest. These schemes are useful in security and forensic applications. We examine how well three similarity digest schemes (Ssdeep, Sdhash and TLSH) work when exposed to random change. Various file types are tested by randomly manipulating source code, HTML, text and executable files. In addition, we test for similarities in modified image files that were generated by cybercriminals to defeat fuzzy hashing schemes (spam images). The experiments expose shortcomings in the Sdhash and Ssdeep schemes that can be exploited in straightforward ways. The results suggest that the TLSH scheme is more robust to the attacks and random changes considered.
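The property described above (small changes in a file give small changes in the digest) and the idea of probing it with random changes can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `toy_digest`, `toy_distance` and `random_changes` are hypothetical helpers, and the toy digest is a simple window histogram, not Ssdeep, Sdhash or TLSH, and not the experimental setup of the paper.

```python
import hashlib
import random
import zlib


def toy_digest(data, buckets=64):
    """Toy similarity digest (NOT Ssdeep, Sdhash or TLSH): a histogram of
    3-byte sliding windows, each window mapped to a bucket with CRC32."""
    counts = [0] * buckets
    for i in range(len(data) - 2):
        counts[zlib.crc32(data[i:i + 3]) % buckets] += 1
    return counts


def toy_distance(a, b):
    """L1 distance between two toy digests; 0 means identical histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))


def random_changes(data, n, rng):
    """Overwrite n distinct, randomly chosen bytes with different values."""
    out = bytearray(data)
    for pos in rng.sample(range(len(out)), n):
        # Adding 1..255 modulo 256 guarantees the byte actually changes.
        out[pos] = (out[pos] + rng.randrange(1, 256)) % 256
    return bytes(out)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    original = bytes(rng.randrange(256) for _ in range(4096))
    base = toy_digest(original)
    for n in (5, 50, 500, 2000):
        modified = random_changes(original, n, rng)
        same_sha = (hashlib.sha256(original).digest()
                    == hashlib.sha256(modified).digest())
        # Columns: bytes changed, toy digest distance, SHA-256 still equal?
        print(n, toy_distance(base, toy_digest(modified)), same_sha)
```

In this sketch each changed byte touches at most three windows, so the toy distance can grow by at most 6 per changed byte, whereas a cryptographic hash such as SHA-256 gives no usable notion of distance once the input changes at all. Real schemes use very different constructions, so the sketch says nothing about how Ssdeep, Sdhash or TLSH behave under such changes.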
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Oliver, J., Forman, S., Cheng, C. (2014). Using Randomization to Attack Similarity Digests. In: Batten, L., Li, G., Niu, W., Warren, M. (eds) Applications and Techniques in Information Security. ATIS 2014. Communications in Computer and Information Science, vol 490. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45670-5_19
DOI: https://doi.org/10.1007/978-3-662-45670-5_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45669-9
Online ISBN: 978-3-662-45670-5
eBook Packages: Computer Science, Computer Science (R0)