Genotyping and Statistical Analysis

Genome-Wide Association Studies

Abstract

The development of technologies for high-throughput profiling of DNA variation has led to the rapid discovery of causal genetic mutations underlying complex phenotypic traits and diseases. These advances were originally enabled by the results of the Human Genome Project (1990–2003), which allowed the completion of the first genome-wide association study in 2002 and led to the development of haplotype maps of the human genome. Technological advances in microarray genotyping and next-generation sequencing have since made widespread, cost-effective application of this approach possible and, in combination, have powered a new age of biomedical discovery. This chapter introduces the history and fundamental principles of genetic association analysis and explains key concepts and current statistical methods for processing these data. Topics discussed include the experimental design of association studies, quality control procedures, approaches for dealing with population stratification, statistical testing for genetic associations, and more recent developments in detecting the effects of rare variants and genetic interactions.

Author information

Corresponding author

Correspondence to Tatsuhiko Tsunoda.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Lysenko, A., Boroevich, K.A., Tsunoda, T. (2019). Genotyping and Statistical Analysis. In: Tsunoda, T., Tanaka, T., Nakamura, Y. (eds) Genome-Wide Association Studies. Springer, Singapore. https://doi.org/10.1007/978-981-13-8177-5_1
