Abstract
The development of technologies for high-throughput profiling of DNA variation has led to the rapid discovery of causal genetic mutations underlying complex phenotypic traits and diseases. These advances were originally enabled by the results of the Human Genome Project (1990–2003), which allowed the completion of the first genome-wide association study in 2002 and led to the development of haplotype maps of the human genome. Technological advances in microarray genotyping and next-generation sequencing have since made widespread, cost-effective application of this approach possible and, in combination, have powered a new age of biomedical discovery. This chapter introduces the history and fundamental principles of genetic association analysis and explains key concepts and current statistical methods for processing these data. Topics discussed include the experimental design of association studies, quality control procedures, approaches for dealing with population stratification, statistical testing for genetic associations, and more recent developments in detecting the effects of rare variants and genetic interactions.
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Lysenko, A., Boroevich, K.A., Tsunoda, T. (2019). Genotyping and Statistical Analysis. In: Tsunoda, T., Tanaka, T., Nakamura, Y. (eds) Genome-Wide Association Studies. Springer, Singapore. https://doi.org/10.1007/978-981-13-8177-5_1
DOI: https://doi.org/10.1007/978-981-13-8177-5_1
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-8176-8
Online ISBN: 978-981-13-8177-5
eBook Packages: Biomedical and Life Sciences (R0)