Creating a Map of User Data in NTFS to Improve File Carving

  • Martin Karresand
  • Asalena Warnqvist
  • David Lindahl
  • Stefan Axelsson
  • Geir Olav Dyrkolbotn
Conference paper
Part of the IFIP Advances in Information and Communication Technology book series (IFIPAICT, volume 569)


Digital forensics and, especially, file carving are burdened by the large amounts of data that need to be processed. Attempts to solve this problem include efficient carving algorithms, parallel processing in the cloud and data reduction by filtering uninteresting files. This research addresses the problem by searching for data where it is more likely to be found. This is accomplished by creating a probability map for finding unique data at various logical block addressing positions in storage media. SHA-1 hashes of 512 B sectors are used to represent the data. The results, which are based on a collection of 30 NTFS partitions from computers running Microsoft Windows 7 and later versions, reveal that the mean probability of finding unique hash values at different logical block addressing positions varies between 12% and 41% in an NTFS partition. The probability map can be used by a forensic analyst to prioritize relevant areas in storage media without the need for a working filesystem. It can also be used to increase the efficiency of hash-based carving by dynamically changing the random sampling frequency. The approach contributes to digital forensic processes by enabling them to focus on interesting regions in storage media, increasing the probability of obtaining relevant results faster.
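
The idea of a positional probability map can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the number of bins and the rule that a hash counts as unique when it occurs exactly once within a single image are all assumptions made for illustration. The paper's own criterion for uniqueness across its 30 partitions may differ. Only the 512 B sector size and the use of SHA-1 come from the abstract.

```python
import hashlib
from collections import Counter

SECTOR_SIZE = 512  # bytes per sector, as stated in the abstract


def sector_hashes(data, sector_size=SECTOR_SIZE):
    """Return the SHA-1 hex digest of every full sector in data."""
    usable = len(data) - len(data) % sector_size  # ignore a trailing partial sector
    return [
        hashlib.sha1(data[i:i + sector_size]).hexdigest()
        for i in range(0, usable, sector_size)
    ]


def unique_probability_map(hashes, bins=10):
    """Per positional bin, the fraction of sectors whose hash occurs once.

    Assumption: uniqueness is judged within this one list of hashes.
    The list index stands in for the logical block address (LBA).
    """
    counts = Counter(hashes)
    n = len(hashes)
    totals = [0] * bins
    uniques = [0] * bins
    for lba, digest in enumerate(hashes):
        b = lba * bins // n  # map the LBA to a relative-position bin
        totals[b] += 1
        if counts[digest] == 1:
            uniques[b] += 1
    return [u / t if t else 0.0 for u, t in zip(uniques, totals)]


def sampling_weights(prob_map):
    """Normalize the map so that bins richer in unique data are sampled more often."""
    total = sum(prob_map)
    if total == 0:
        return [1.0 / len(prob_map)] * len(prob_map)  # fall back to uniform sampling
    return [p / total for p in prob_map]


if __name__ == "__main__":
    # Toy image: four identical zero-filled sectors, then four distinct sectors.
    image = bytes(4 * SECTOR_SIZE) + b"".join(
        bytes([i]) * SECTOR_SIZE for i in range(1, 5)
    )
    pmap = unique_probability_map(sector_hashes(image), bins=2)
    print(pmap)
    print(sampling_weights(pmap))
```

In this toy image the first half contains only repeated sectors and the second half only distinct ones, so a sampler driven by the weights would spend all of its reads in the second half. A real map would be averaged over many partitions and use far more bins.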


Keywords: File carving, hash-based carving, partition content map, NTFS





Copyright information

© IFIP International Federation for Information Processing 2019

Authors and Affiliations

  • Martin Karresand (1)
  • Asalena Warnqvist (2)
  • David Lindahl (3)
  • Stefan Axelsson (1)
  • Geir Olav Dyrkolbotn (1)

  1. Norwegian University of Science and Technology, Gjovik, Norway
  2. National Forensic Centre, Swedish Police Authority, Linkoping, Sweden
  3. Swedish Defence Research Agency, Linkoping, Sweden
