Error Correction in Methylation Profiling From NGS Bisulfite Protocols

Abstract

Whole-genome bisulfite sequencing (WGBS) has emerged as the primary technique for DNA methylation studies because of its great potential in terms of speed, specificity, and the ability to address new biological questions such as non-CpG context methylation and hemimethylation. However, despite the improvement that WGBS represents, processing and analyzing the resulting datasets is less straightforward than for other methylation assays, and special care must be taken to obtain reliable results. To our knowledge, no extensive review exists of the error sources that can bias methylation level measurements or of the algorithms that have been proposed to deal with them. In this chapter, we therefore review and critically evaluate all known WGBS error sources in order to suggest a few best practices for dealing with every source of bias in WGBS assays.

Author information

Corresponding author

Correspondence to Guillermo Barturen.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Barturen, G., Oliver, J.L., Hackenberg, M. (2017). Error Correction in Methylation Profiling From NGS Bisulfite Protocols. In: Elloumi, M. (ed.) Algorithms for Next-Generation Sequencing Data. Springer, Cham. https://doi.org/10.1007/978-3-319-59826-0_8

  • DOI: https://doi.org/10.1007/978-3-319-59826-0_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-59824-6

  • Online ISBN: 978-3-319-59826-0

  • eBook Packages: Computer Science (R0)
