pp 1-43

Genotyping and Sequencing Technologies in Population Genetics and Genomics

Chapter
Part of the Population Genomics book series

Abstract

Genotypes are the central data of any population genetic or genomic study, and genotyping methods have evolved steadily since enzyme protein electrophoresis enabled the first direct glimpses of genetic variation. Following the development of the polymerase chain reaction, allozymes were supplanted by methods that directly measured allelic variation in nuclear and organellar DNA, most notably restriction fragment length polymorphisms (RFLPs), amplified fragment length polymorphisms (AFLPs), and microsatellites. At the turn of the millennium, genome-scale polymorphism detection and scoring was still hampered by the low-throughput nature of Sanger sequencing. This limitation eased with the advent of genotyping microarrays, which at first yielded hundreds of data points per sample (a revolution at the time) and subsequently improved to the point where hundreds of thousands of genetic variants could be scored simultaneously. These methods suffered a major flaw, however: their cost put them out of reach for studies of most species that are ecologically important but economically unimportant. The democratization of population genomics arrived with the advent of high-throughput, short-read sequencers and the subsequent development of DNA library techniques that subsample the genome in a large number of individuals. Today, such methods (genotyping-by-sequencing, restriction site-associated DNA sequencing, RNA sequencing, and sequence capture) have become mainstays of the population geneticist’s toolkit. Refinements to existing library and sequencing methods continue to emerge at a rapid pace, and novel sequencing platforms may soon put the gold standard of long-read, genome-wide coverage within broader reach. In this chapter, we comprehensively review genotyping methods used in population genetics, beginning with allozymes and progressing through AFLPs, microsatellites, and SNP arrays. We then turn to a detailed discussion of methods that leverage next-generation technologies to enable truly genome-scale genotyping. Finally, we discuss recent developments and emerging technologies that constitute the “third wave” of sequencing and genotyping methods. Throughout, our aim is to provide methodological details that will be of use to population geneticists.

Keywords

Ecological genomics; Genotyping by sequencing; Illumina; Population genomics; Sequence capture

Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • J. A. Holliday (1)
  • E. M. Hallerman (2)
  • D. C. Haak (3)
  1. Department of Forest Resources and Environmental Conservation, Virginia Tech, Blacksburg, USA
  2. Department of Fish and Wildlife Conservation, Virginia Tech, Blacksburg, USA
  3. Department of Plant Pathology, Physiology, and Weed Science, Virginia Tech, Blacksburg, USA
