A Survey of Methods and Tools for Large-Scale DNA Mixture Profiling

  • Chapter
Smart Infrastructure and Applications

Part of the book series: EAI/Springer Innovations in Communication and Computing (EAISICC)

Abstract

DNA typing, or profiling, is widely used for criminal identification, paternity testing, and the diagnosis of genetic diseases. It is considered one of the hardest problems in forensic science and remains an active area of research. The computational complexity of DNA typing increases significantly with the number of unknown contributors in a mixture, and this has been the major factor holding back its advancement and application. In this chapter, we provide an extended review of DNA profiling methods and tools, with a particular focus on their computational performance and accuracy. We explain the process of DNA profiling within the broader context of forensic science and genetics. We review the various classes of DNA profiling methods, including general methods and those based on maximum likelihood estimators. The reviewed DNA profiling tools include LRmix Studio, TrueAllele, DNAMIX V.3, EuroForMix, CEESIt, NOCIt, DNAMixture, Kongoh, likeLTD, Lab Retriever, and STRmix. We also review the literature on high-performance computing (HPC) in bioinformatics and on HPC frameworks. Faster and more accurate interpretation of DNA mixtures with a large number of unknowns is expected to open up new frontiers for this area.
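As a rough illustration of why the cost grows with the number of unknowns, the following minimal Python sketch counts the genotype assignments a naive, exhaustive interpretation would have to consider at a single locus. It is not the method of any tool reviewed in the chapter; the function names and the choice of 10 alleles are hypothetical, and the count is only an upper bound because it ignores the constraints imposed by the observed peaks, drop-in, and drop-out.

```python
from math import comb


def genotypes_per_locus(num_alleles: int) -> int:
    """Count unordered diploid genotypes at one locus.

    Heterozygous pairs are C(k, 2); homozygous genotypes add k more.
    """
    return comb(num_alleles, 2) + num_alleles


def genotype_combinations(num_alleles: int, num_unknowns: int) -> int:
    """Naive count of genotype assignments for the unknown contributors.

    Each unknown contributor can independently carry any genotype, so the
    count is (genotypes per locus) ** (number of unknowns).
    """
    return genotypes_per_locus(num_alleles) ** num_unknowns


if __name__ == "__main__":
    # Hypothetical locus with 10 possible alleles (an assumption for illustration).
    for unknowns in range(1, 6):
        print(unknowns, genotype_combinations(10, unknowns))
```

Under these assumptions the count is exponential in the number of unknown contributors, and a full profile multiplies this work across every locus typed.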



Acknowledgments

The work carried out in this chapter is supported by the HPC Center at King Abdulaziz University.

Corresponding author

Correspondence to Emad Alamoudi.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Alamoudi, E., Mehmood, R., Albeshri, A., Gojobori, T. (2020). A Survey of Methods and Tools for Large-Scale DNA Mixture Profiling. In: Mehmood, R., See, S., Katib, I., Chlamtac, I. (eds) Smart Infrastructure and Applications. EAI/Springer Innovations in Communication and Computing. Springer, Cham. https://doi.org/10.1007/978-3-030-13705-2_9

  • DOI: https://doi.org/10.1007/978-3-030-13705-2_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-13704-5

  • Online ISBN: 978-3-030-13705-2

  • eBook Packages: Engineering, Engineering (R0)
