Using Randomization to Attack Similarity Digests

Oliver, Jonathan; Forman, Scott; Cheng, Chun

doi:10.1007/978-3-662-45670-5_19

Part of the book series: Communications in Computer and Information Science (CCIS, volume 490)

Included in the following conference series:

International Conference on Applications and Techniques in Information Security

Abstract

There has been considerable research into, and use of, similarity digests and Locality Sensitive Hashing (LSH) schemes: hashing schemes in which small changes to a file result in small changes to the digest. These schemes are useful in security and forensic applications. We examine how well three similarity digest schemes (Ssdeep, Sdhash and TLSH) work when exposed to random change. Various file types are tested by randomly manipulating source code, HTML, text and executable files. In addition, we test for similarities in modified image files that were generated by cybercriminals to defeat fuzzy hashing schemes (spam images). The experiments expose shortcomings in the Sdhash and Ssdeep schemes that can be exploited in straightforward ways. The results suggest that the TLSH scheme is more robust to the attacks and random changes considered.
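The property described in the abstract, that small changes to a file should produce small changes in the digest, can be illustrated with a minimal sketch. The code below is not Ssdeep, Sdhash or TLSH, and it is not the experimental procedure of the paper. It uses a hypothetical toy digest (a histogram of byte trigrams hashed into buckets) and a hypothetical random-substitution routine, both introduced here purely for illustration, and contrasts the toy digest with a cryptographic hash, which changes completely under the same edits.

```python
import hashlib
import random
import zlib
from typing import List


def toy_digest(data: bytes, buckets: int = 64) -> List[int]:
    """Toy similarity digest (an assumption for illustration only):
    count every overlapping byte trigram into one of `buckets` bins."""
    counts = [0] * buckets
    for i in range(len(data) - 2):
        trigram = data[i:i + 3]
        counts[zlib.crc32(trigram) % buckets] += 1
    return counts


def l1_distance(a: List[int], b: List[int]) -> int:
    """Sum of absolute differences between two bucket histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))


def random_edits(data: bytes, n: int, rng: random.Random) -> bytes:
    """Substitute n bytes at distinct random positions.
    Each chosen byte is shifted by a non-zero amount, so it always changes."""
    out = bytearray(data)
    for pos in rng.sample(range(len(out)), n):
        out[pos] = (out[pos] + rng.randrange(1, 256)) % 256
    return bytes(out)


if __name__ == "__main__":
    rng = random.Random(0)
    original = bytes(rng.randrange(256) for _ in range(2000))
    modified = random_edits(original, 5, rng)

    # The toy digest moves only slightly under a few substitutions.
    print("toy digest distance:",
          l1_distance(toy_digest(original), toy_digest(modified)))
    # A cryptographic hash gives no indication that the inputs are similar.
    print("sha256 original:", hashlib.sha256(original).hexdigest())
    print("sha256 modified:", hashlib.sha256(modified).hexdigest())
```

In this toy scheme a single byte substitution touches at most three trigrams, and each changed trigram moves the histogram by at most 2, so n substitutions change the distance by at most 6n. Real similarity digests quantize and compress such features, which is where the weaknesses to random change studied in the paper can arise.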

Author information

Authors and Affiliations

Trend Micro, Melbourne, Australia
Jonathan Oliver, Scott Forman & Chun Cheng

Editor information

Editors and Affiliations

School of Information Technology, Deakin University, Melbourne, Australia
Lynn Batten
School of Information Technology, Deakin University, Australia
Gang Li
Institute of Information Engineering, Chinese Academy of Sciences, China
Wenjia Niu
School of Information and Business Analysis, Melbourne Campus at Burwood, Deakin University, 221, Burwood Highway, 315, Burwood, VIC, Australia
Matthew Warren

Cite this paper

Oliver, J., Forman, S., Cheng, C. (2014). Using Randomization to Attack Similarity Digests. In: Batten, L., Li, G., Niu, W., Warren, M. (eds) Applications and Techniques in Information Security. ATIS 2014. Communications in Computer and Information Science, vol 490. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45670-5_19

DOI: https://doi.org/10.1007/978-3-662-45670-5_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45669-9
Online ISBN: 978-3-662-45670-5
eBook Packages: Computer Science, Computer Science (R0)
