Identification of Factors that Affect Reproducibility of Mutation Calling Methods in Data Originating from the Next-Generation Sequencing

  • Conference paper
Computer and Information Sciences (ISCIS 2018)

Abstract

Identification of somatic mutations based on next-generation DNA sequencing data has become one of the fundamental research strategies in oncology, with the goal of uncovering the mechanisms underlying carcinogenesis and resistance to commonly used therapies. Despite significant advances in sequencing methods and data processing algorithms, the reproducibility of experiments remains relatively low and depends significantly on the methods used to identify changes in the structure of the DNA. This is mainly due to three factors: (1) high tumor heterogeneity, because of which some mutations are characteristic of only a small number of cells, (2) bias associated with the process of exome isolation, and (3) the specificity of data pre-processing strategies.

The aim of this work was to determine the impact of these factors on the identification of somatic mutations, and thereby to identify the reasons for low reproducibility in such studies.

Acknowledgements

This work was partially supported by the National Centre for Research and Development grant No. Strategmed2/267398/4/NCBR/2015 (KPM), the National Science Centre grant No. 2016/23/D/ST7/03665 (RJ), and an internal grant of the Institute of Automatic Control, BK-204/RAu1/2017 (AS).

Calculations were carried out using the infrastructure of the Ziemowit computer cluster (www.ziemowit.hpc.polsl.pl) in the Laboratory of Bioinformatics and Computational Biology, The Biotechnology, Bioengineering and Bioinformatics Centre Silesian BIO-FARMA, created in the POIG.02.01.00-00-166/08 project and expanded in the POIG.02.03.01-00-040/13 project.

Author information

Corresponding author

Correspondence to Roman Jaksik.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Jaksik, R., Psiuk-Maksymowicz, K., Swierniak, A. (2018). Identification of Factors that Affect Reproducibility of Mutation Calling Methods in Data Originating from the Next-Generation Sequencing. In: Czachórski, T., Gelenbe, E., Grochla, K., Lent, R. (eds) Computer and Information Sciences. ISCIS 2018. Communications in Computer and Information Science, vol 935. Springer, Cham. https://doi.org/10.1007/978-3-030-00840-6_29

  • DOI: https://doi.org/10.1007/978-3-030-00840-6_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00839-0

  • Online ISBN: 978-3-030-00840-6

  • eBook Packages: Computer Science, Computer Science (R0)
