Abstract
Forensic investigations are becoming increasingly difficult as the amount of data to be analyzed keeps growing. A common approach to automated file identification is the use of hash functions. The procedure is simple: a tool hashes all files on a seized device and compares the hashes against a database. Depending on the database, this allows investigators to discard non-relevant files (whitelisting) or to detect suspicious files (blacklisting).
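The hash-and-compare procedure described above can be sketched as follows. This is a minimal illustration, not the tooling used in the paper: the function names `sha256_of` and `triage`, the choice of SHA-256, and the representation of the database as two in-memory sets of hex digests are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def triage(root, whitelist, blacklist):
    """Sort every regular file under root by database lookup.

    whitelist and blacklist are hypothetical sets of hex digests
    standing in for a real reference database.
    """
    result = {"discard": [], "suspicious": [], "unknown": []}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        file_hash = sha256_of(path)
        if file_hash in blacklist:
            # Known-bad file: flag for the investigator.
            result["suspicious"].append(path)
        elif file_hash in whitelist:
            # Known-good file: safe to ignore.
            result["discard"].append(path)
        else:
            # Not in either database: needs further analysis.
            result["unknown"].append(path)
    return result
```

A blacklist hit is checked first so that a file listed in both sets is still flagged. Files that match neither set are exactly the ones for which exact hashing gives no answer.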
One can distinguish three kinds of algorithms: (cryptographic) hash functions, bytewise approximate matching, and semantic approximate matching (also known as perceptual hashing). The main difference is the level at which they operate: semantic approximate matching operates on the semantic level, while the other two approaches operate on the byte level. Hence, investigators have three different approaches at hand to analyze a device.
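The difference between the byte level and the semantic level can be illustrated with a toy example. The sketch below is not one of the algorithms surveyed in the paper: `mean_threshold_hash` is a deliberately simplified, hypothetical stand-in for a perceptual image hash, and the 4x4 grayscale grid is made-up data. It shows that a uniform brightness change alters every byte, and therefore the cryptographic hash, while the thresholded bit pattern stays the same.

```python
import hashlib


def mean_threshold_hash(pixels):
    """Toy semantic hash: one bit per pixel, set if the pixel exceeds the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(bits_a, bits_b):
    """Count the positions at which two equal-length bit lists differ."""
    if len(bits_a) != len(bits_b):
        raise ValueError("hashes must have equal length")
    return sum(a != b for a, b in zip(bits_a, bits_b))


def sha256_of_pixels(pixels):
    """Byte-level view: hash the raw pixel bytes with SHA-256."""
    return hashlib.sha256(bytes(v for row in pixels for v in row)).hexdigest()


# Made-up 4x4 grayscale image (values 0..255) with alternating dark and bright columns.
original = [
    [10, 200, 30, 220],
    [15, 210, 25, 230],
    [20, 190, 35, 225],
    [12, 205, 28, 215],
]
# Same picture, uniformly brightened: every byte changes.
brightened = [[value + 10 for value in row] for row in original]

crypto_match = sha256_of_pixels(original) == sha256_of_pixels(brightened)
semantic_distance = hamming_distance(
    mean_threshold_hash(original), mean_threshold_hash(brightened)
)
print("cryptographic hashes equal:", crypto_match)
print("Hamming distance of toy semantic hashes:", semantic_distance)
```

Real perceptual hashes first normalize the image (for example by scaling and grayscale conversion) and are compared against a distance threshold rather than for equality. The toy function skips those steps.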
First, this paper gives a comprehensive overview of existing approaches for bytewise and semantic approximate matching (for semantic approximate matching, we focus on image functions). Second, we compare implementations and summarize the strengths and weaknesses of all approaches. Third, we show, based on a sample use case, how to integrate these functions into an existing process model, the computer forensics field triage process model.
Notes
1. Well-known synonyms are fuzzy hashing and similarity hashing.
2. Well-known synonyms are perceptual hashing and robust hashing.
3. http://ssdeep.sourceforge.net; visited 2013-Aug-20.
4. http://roussev.net/sdhash/sdhash.html; visited 2013-Aug-20.
5. http://phash.org; visited 2013-Aug-20.
6. http://perkeo.com; visited 2013-Aug-20.
7. Live response.
Acknowledgments
This work is supported by CASED (Center for Advanced Security Research Darmstadt).
Copyright information
© 2014 Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Breitinger, F., Liu, H., Winter, C., Baier, H., Rybalchenko, A., Steinebach, M. (2014). Towards a Process Model for Hash Functions in Digital Forensics. In: Gladyshev, P., Marrington, A., Baggili, I. (eds) Digital Forensics and Cyber Crime. ICDF2C 2013. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 132. Springer, Cham. https://doi.org/10.1007/978-3-319-14289-0_12
DOI: https://doi.org/10.1007/978-3-319-14289-0_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14288-3
Online ISBN: 978-3-319-14289-0
eBook Packages: Computer Science, Computer Science (R0)