Genotyping and Statistical Analysis

Lysenko, Artem; Boroevich, Keith A.; Tsunoda, Tatsuhiko

doi:10.1007/978-981-13-8177-5_1

Abstract

The development of technologies for high-throughput profiling of DNA variation has led to the rapid discovery of causal genetic mutations underlying complex phenotypic traits and diseases. These advances were originally enabled by the results of the Human Genome Project (1990–2003), which allowed the completion of the first genome-wide association study in 2002 and led to the development of haplotype maps of the human genome. Technological advances in microarray genotyping and next-generation sequencing have since made widespread, cost-effective application of this approach possible and, in combination, have powered a new age of biomedical discovery. This chapter introduces the history and fundamental principles of genetic association analysis and explains key concepts and current statistical methods for processing these data. Topics discussed include the experimental design of association studies, quality control procedures, approaches for dealing with population stratification, statistical testing for genetic associations, and more recent developments in detecting the effects of rare variants and genetic interactions.

Author information

Authors and Affiliations

Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
Artem Lysenko, Keith A. Boroevich & Tatsuhiko Tsunoda
Tsunoda Laboratory (Medical Science Mathematics), Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, Japan
Tatsuhiko Tsunoda
Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan
Tatsuhiko Tsunoda

Corresponding author

Correspondence to Tatsuhiko Tsunoda.

Editor information

Editors and Affiliations

Graduate School of Science, The University of Tokyo, Tokyo, Japan
Tatsuhiko Tsunoda
BioResource Research Center, Tokyo Medical and Dental University, Tokyo, Japan
Toshihiro Tanaka
Cancer Precision Medicine Center, Japanese Foundation for Cancer Research, Tokyo, Japan
Yusuke Nakamura

About this chapter

Cite this chapter

Lysenko, A., Boroevich, K.A., Tsunoda, T. (2019). Genotyping and Statistical Analysis. In: Tsunoda, T., Tanaka, T., Nakamura, Y. (eds) Genome-Wide Association Studies. Springer, Singapore. https://doi.org/10.1007/978-981-13-8177-5_1

DOI: https://doi.org/10.1007/978-981-13-8177-5_1
Published: 01 November 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-8176-8
Online ISBN: 978-981-13-8177-5
eBook Packages: Biomedical and Life Sciences, Biomedical and Life Sciences (R0)
