Skim-Based Genotyping by Sequencing Using a Double Haploid Population to Call SNPs, Infer Gene Conversions, and Improve Genome Assemblies

  • Philipp Emanuel Bayer
Part of the Methods in Molecular Biology book series (MIMB, volume 1374)


Genotyping by sequencing (GBS) is an emerging technology that uses genome sequencing to rapidly call large numbers of single nucleotide polymorphisms (SNPs). Several methodologies and approaches have recently been established, most of which rely on a specific preparation of the data. Here we describe our GBS pipeline, which uses high-coverage reads from two parents and low-coverage reads from their double haploid offspring to call SNPs on a large scale. The advantages of this approach are its high resolution and scalability.
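
As a rough illustration of the idea described above, the following sketch assigns a parental allele to one offspring at each position where the two parents carry different homozygous alleles. It is not the chapter's pipeline: the function name `call_offspring_genotypes`, the dictionary-based inputs, and the "A"/"B"/"-" coding are all assumptions made for this example.

```python
from collections import Counter


def call_offspring_genotypes(parent_a, parent_b, offspring_reads):
    """Assign parental alleles to one offspring at parental SNP positions.

    parent_a, parent_b: dict mapping position -> homozygous base call
        from the high-coverage parents (hypothetical input format).
    offspring_reads: dict mapping position -> list of bases observed in
        the low-coverage offspring reads covering that position.
    Returns a dict mapping position -> "A", "B" or "-" (missing/ambiguous).
    """
    calls = {}
    # Only positions called in both parents can be compared.
    for pos in sorted(parent_a.keys() & parent_b.keys()):
        a, b = parent_a[pos], parent_b[pos]
        if a == b:
            continue  # not polymorphic between the parents, so not a SNP
        counts = Counter(offspring_reads.get(pos, []))
        has_a, has_b = counts[a] > 0, counts[b] > 0
        if has_a and not has_b:
            calls[pos] = "A"
        elif has_b and not has_a:
            calls[pos] = "B"
        else:
            # No covering reads, or reads supporting both parental alleles.
            calls[pos] = "-"
    return calls


# Toy data (invented for illustration only).
parent_a = {100: "A", 250: "G", 400: "T", 520: "C"}
parent_b = {100: "C", 250: "G", 400: "A", 520: "T"}
offspring = {100: ["A"], 400: ["A", "A"], 520: ["C", "T"]}

calls = call_offspring_genotypes(parent_a, parent_b, offspring)
print(calls)
```

Because each line in a double haploid population is expected to be homozygous, the sketch treats a position with reads supporting both parental alleles as ambiguous rather than heterozygous. Low coverage means many positions will have no reads at all, which is why a missing call is an explicit outcome.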

Key words

Genotyping by sequencing, SNPs, SNP calling, Bioinformatics, Genotyping



The author acknowledges funding support from the Australian Research Council (Project LP110100200). Support is also acknowledged from the Queensland Cyber Infrastructure Foundation (QCIF) and the Australian Partnership for Advanced Computing (APAC).



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. School of Plant Biology, University of Western Australia, Perth, Australia
