Towards a Process Model for Hash Functions in Digital Forensics

  • Conference paper
  • Digital Forensics and Cyber Crime (ICDF2C 2013)

Abstract

Forensic investigations are becoming increasingly difficult as the amount of data to be analyzed grows continuously. Hash functions are a common approach to automated file identification. The procedure is quite simple: a tool hashes all files on a seized device and compares the hashes against a database. Depending on the database, this allows investigators to discard non-relevant files (whitelisting) or to detect suspicious ones (blacklisting).
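The procedure described above can be sketched in a few lines of Python. This is only a minimal illustration, not the tooling used in the paper: the function names (`sha1_of_file`, `classify`), the choice of SHA-1, and the representation of the database as two in-memory sets of hex digests are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, reading it in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def classify(root, whitelist, blacklist):
    """Hash every regular file below `root` and sort it into three buckets.

    `whitelist` and `blacklist` are hypothetical stand-ins for a reference
    database: plain sets of hex digests of known-good and known-bad files.
    """
    result = {"known_good": [], "known_bad": [], "unknown": []}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        digest = sha1_of_file(path)
        if digest in blacklist:
            result["known_bad"].append(path)    # blacklisting: flag for review
        elif digest in whitelist:
            result["known_good"].append(path)   # whitelisting: safe to discard
        else:
            result["unknown"].append(path)      # still needs manual analysis
    return result
```

Only the files in the `unknown` bucket remain for closer inspection; a real deployment would query a reference database rather than in-memory sets.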

One can distinguish three kinds of algorithms: (cryptographic) hash functions, bytewise approximate matching, and semantic approximate matching (a.k.a. perceptual hashing). The main difference is the level at which they operate: semantic approximate matching operates on the semantic level, while the other two approaches operate on the byte level. Hence, investigators have three different approaches at hand to analyze a device.
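The difference between the two byte-level approaches can be illustrated with a toy example. The sketch below is an assumption-laden illustration, not one of the algorithms surveyed in the paper: `bytewise_similarity` is a hypothetical helper that computes the Jaccard index over overlapping byte n-grams, chosen only because it is short. It contrasts a cryptographic hash, which changes completely after a one-byte edit, with a byte-level similarity score, which stays high. Semantic approximate matching is not shown, since it requires decoding the content (for example, an image) first.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Cryptographic hash: any change to the input yields an unrelated digest."""
    return hashlib.sha256(data).hexdigest()


def ngram_set(data: bytes, n: int = 4) -> set:
    """All overlapping byte n-grams of `data` (empty if data is shorter than n)."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def bytewise_similarity(a: bytes, b: bytes, n: int = 4) -> float:
    """Toy byte-level similarity: Jaccard index of the two n-gram sets."""
    sa, sb = ngram_set(a, n), ngram_set(b, n)
    if not sa and not sb:
        return 1.0  # two inputs without any n-grams are treated as identical
    return len(sa & sb) / len(sa | sb)


original = b"The quick brown fox jumps over the lazy dog" * 20
modified = original.replace(b"lazy", b"lazY", 1)  # change a single byte

# The digests no longer match, so exact hashing misses the near-duplicate.
print(sha256_hex(original) == sha256_hex(modified))
# The byte-level score remains close to 1.0.
print(bytewise_similarity(original, modified))
```

Unlike this toy measure, practical approximate matching tools produce compact digests, so the original inputs need not be kept for the comparison.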

First, this paper gives a comprehensive overview of existing approaches for bytewise and semantic approximate matching (for semantic approximate matching, we focus on image functions). Second, we compare implementations and summarize the strengths and weaknesses of all approaches. Third, using a sample use case, we show how to integrate these functions into an existing process model, the computer forensics field triage process model.

Notes

  1. Well-known synonyms are fuzzy hashing and similarity hashing.

  2. Well-known synonyms are perceptual hashing and robust hashing.

  3. http://ssdeep.sourceforge.net; visited 2013-Aug-20.

  4. http://roussev.net/sdhash/sdhash.html; visited 2013-Aug-20.

  5. http://phash.org; visited 2013-Aug-20.

  6. http://perkeo.com; visited 2013-Aug-20.

  7. Live response.

Acknowledgments

This work is supported by CASED (Center for Advanced Security Research Darmstadt).

Author information

Correspondence to Frank Breitinger.

Copyright information

© 2014 Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Breitinger, F., Liu, H., Winter, C., Baier, H., Rybalchenko, A., Steinebach, M. (2014). Towards a Process Model for Hash Functions in Digital Forensics. In: Gladyshev, P., Marrington, A., Baggili, I. (eds) Digital Forensics and Cyber Crime. ICDF2C 2013. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 132. Springer, Cham. https://doi.org/10.1007/978-3-319-14289-0_12

  • DOI: https://doi.org/10.1007/978-3-319-14289-0_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-14288-3

  • Online ISBN: 978-3-319-14289-0

  • eBook Packages: Computer Science, Computer Science (R0)
