Part of the book series: Statistics for Biology and Health (SBH)

Abstract

This chapter provides an elementary introduction to the basic biology and technology underlying genetic association studies that rely on dense genotyping of nominally unrelated individuals to discover genetic variants related to risk of disease and to other outcomes, phenotypes, or traits. The chapter discusses relevant aspects of DNA and RNA architecture and the coding of amino acids, describes chromosomal organization, surveys the most common types of sequence variation, and gives an overview of genotyping methods. It also introduces concepts, databases, analysis programs, and example data that are used in later portions of the book.

Notes

  1. These are DNA codons. RNA codons (transcribed from DNA for interaction with the ribosome during protein formation) contain uracil (U) wherever the DNA codon has thymine (T), because RNA molecules use U in place of T.
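
As a minimal sketch of the substitution described in this note, the Python snippet below rewrites a DNA codon as the corresponding RNA codon. It assumes the codon is written on the coding (sense) strand, so only the T-to-U replacement is needed; the function name `dna_codon_to_rna` is hypothetical and not taken from the chapter.

```python
def dna_codon_to_rna(codon: str) -> str:
    """Return the RNA spelling of a DNA codon by replacing T with U.

    Assumption: the codon is given on the coding (sense) strand, so no
    complementing is performed, only the T -> U substitution.
    """
    codon = codon.upper()
    # A codon is exactly three bases drawn from A, C, G, T.
    if len(codon) != 3 or set(codon) - set("ACGT"):
        raise ValueError(f"not a DNA codon: {codon!r}")
    return codon.replace("T", "U")


# Map a few DNA codons to their RNA spellings.
examples = {c: dna_codon_to_rna(c) for c in ("ATG", "TTT", "GCA")}
```

Codons without thymine, such as GCA, are returned unchanged.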

References

  1. Watson, J. D., & Crick, F. H. (1953). Molecular structure of nucleic acids; a structure for deoxyribose nucleic acid. Nature, 171, 737–738.

  2. Watson, J. D., & Crick, F. H. (1953). Genetical implications of the structure of deoxyribonucleic acid. Nature, 171, 964–967.

  3. Darwin, C. (1859). On the origin of species by means of natural selection, or, the preservation of favoured races in the struggle for life. London: John Murray.

  4. Muller, H. J. (1927). Artificial transmutation of the gene. Science, 66, 84–87.

  5. Beadle, G. W., & Tatum, E. L. (1941). Genetic control of biochemical reactions in Neurospora. Proceedings of the National Academy of Sciences USA, 27, 499–506.

  6. Kimura, M., & Crow, J. F. (1964). The number of alleles that can be maintained in a finite population. Genetics, 49, 725–738.

  7. Kimura, M. (1983). Rare variant alleles in the light of the neutral theory. Molecular Biology and Evolution, 1, 84–93.

  8. Baker, B. S., Carpenter, A. T., Esposito, M. S., Esposito, R. E., & Sandler, L. (1976). The genetic control of meiosis. Annual Review of Genetics, 10, 53–134.

  9. Kaback, D. B. (1996). Chromosome-size dependent control of meiotic recombination in humans. Nature Genetics, 13, 20–21.

  10. Kaback, D. B., Guacci, V., Barber, D., & Mahon, J. W. (1992). Chromosome size-dependent control of meiotic recombination. Science, 256, 228–232.

  11. Okamoto, I., Otte, A. P., Allis, C. D., Reinberg, D., & Heard, E. (2004). Epigenetic dynamics of imprinted X inactivation during early mouse development. Science, 303, 644–649.

  12. Ecker, J. R., Bickmore, W. A., Barroso, I., Pritchard, J. K., Gilad, Y., & Segal, E. (2012). Genomics: ENCODE explained. Nature, 489, 52–55.

  13. Abecasis, G. R., Auton, A., Brooks, L. D., DePristo, M. A., Durbin, R. M., Handsaker, R. E., et al. (2012). An integrated map of genetic variation from 1,092 human genomes. Nature, 491, 56–65.

  14. King, M. C., Marks, J. H., & Mandell, J. B. (2003). Breast and ovarian cancer risks due to inherited mutations in BRCA1 and BRCA2. Science, 302, 643–646.

  15. Kumar, P., Henikoff, S., & Ng, P. C. (2009). Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nature Protocols, 4, 1073–1081.

  16. Adzhubei, I. A., Schmidt, S., Peshkin, L., Ramensky, V. E., Gerasimova, A., Bork, P., et al. (2010). A method and server for predicting damaging missense mutations. Nature Methods, 7, 248–249.

  17. Nelson, S. (2011, March 25). UW Genetics Coordinating Center: the + and – of DNA strand issues, p. 21. Seattle, WA: University of Washington.

  18. Zhang, Z., & Gerstein, M. (2003). Patterns of nucleotide substitution, insertion and deletion in the human genome inferred from pseudogenes. Nucleic Acids Research, 31, 5338–5348.

  19. Iafrate, A. J., Feuk, L., Rivera, M. N., Listewnik, M. L., Donahoe, P. K., Qi, Y., et al. (2004). Detection of large-scale variation in the human genome. Nature Genetics, 36, 949–951.

  20. Sebat, J., Lakshmi, B., Troge, J., Alexander, J., Young, J., Lundin, P., et al. (2004). Large-scale copy number polymorphism in the human genome. Science, 305, 525–528.

  21. Feuk, L., Carson, A. R., & Scherer, S. W. (2006). Structural variation in the human genome. Nature Reviews. Genetics, 7, 85–97.

  22. Bunin, G. R., Needle, M., & Riccardi, V. M. (1997). Paternal age and sporadic neurofibromatosis 1: a case-control study and consideration of the methodologic issues. Genetic Epidemiology, 14, 507–516.

  23. Haiman, C. A., Han, Y., Feng, Y., Xia, L., Hsu, C., Sheng, X., et al. (2013). Genome-wide testing of putative functional exonic variants in relationship with breast and prostate cancer risk in a multiethnic population. PLoS Genetics, 9, e1003419.

  24. Nackley, A. G., Shabalina, S. A., Tchivileva, I. E., Satterfield, K., Korchynskyi, O., Makarov, S. S., et al. (2006). Human catechol-O-methyltransferase haplotypes modulate protein expression by altering mRNA secondary structure. Science, 314, 1930–1933.

  25. Tamiya, G., Shinya, M., Imanishi, T., Ikuta, T., Makino, S., Okamoto, K., et al. (2005). Whole genome association study of rheumatoid arthritis using 27 039 microsatellites. Human Molecular Genetics, 14, 2305–2321.

  26. Taylor, R. W., & Turnbull, D. M. (2005). Mitochondrial DNA mutations in human disease. Nature Reviews. Genetics, 6, 389–402.

  27. Holt, I. J., Harding, A. E., & Morgan-Hughes, J. A. (1988). Deletions of muscle mitochondrial DNA in patients with mitochondrial myopathies. Nature, 331, 717–719.

  28. Illumina. (2009). Improved cluster generation with Gentrain2. San Diego, CA: Illumina Inc.

  29. Schillert, A., & Ziegler, A. (2012). Genotype calling for the Affymetrix platform. Methods in Molecular Biology, 850, 513–523.

  30. Korn, J. M., Kuruvilla, F. G., McCarroll, S. A., Wysoker, A., Nemesh, J., Cawley, S., et al. (2008). Integrated genotype calling and association analysis of SNPs, common copy number polymorphisms and rare CNVs. Nature Genetics, 40, 1253–1260.

  31. Giannoulatou, E., Yau, C., Colella, S., Ragoussis, J., & Holmes, C. C. (2008). GenoSNP: a variational Bayes within-sample SNP genotyping algorithm that does not require a reference population. Bioinformatics, 24, 2209–2214.

  32. Carvalho, B., Bengtsson, H., Speed, T. P., & Irizarry, R. A. (2007). Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data. Biostatistics, 8, 485–499.

  33. Browning, B. L., & Yu, Z. (2009). Simultaneous genotype calling and haplotype phasing improves genotype accuracy and reduces false-positive associations for genome-wide association studies. American Journal of Human Genetics, 85, 847–861.

  34. Affymetrix. (2006). BRLMM: an improved genotype calling method for the GeneChip Human Mapping 500K Array Set. Santa Clara, CA: Affymetrix.

  35. Li, G., Gelernter, J., Kranzler, H. R., & Zhao, H. (2012). M(3): an improved SNP calling algorithm for Illumina BeadArray data. Bioinformatics, 28, 358–365.

  36. Eeles, R. A., Olama, A. A. A., Benlloch, S., Saunders, E. J., Leongamornlert, D. A., Tymrakiewicz, M., Ghoussaini, M., et al. (2013). Identification of 23 novel prostate cancer susceptibility loci using a custom array (the iCOGS) in an international consortium, PRACTICAL. Nature Genetics, 45, 385–391.

  37. Wu, Y., Waite, L. L., Jackson, A. U., Sheu, W. H. H., Buyske, S., Absher, D., et al. (2013). Trans-ethnic fine-mapping of lipid loci identifies population-specific signals and allelic heterogeneity that increases the trait variance explained. PLoS Genetics, 9, e1003379.

  38. Purcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M. A., Bender, D., et al. (2007). PLINK: a tool set for whole-genome association and population-based linkage analyses. American Journal of Human Genetics, 81, 559–575.

  39. Price, A. L., Patterson, N. J., Plenge, R. M., Weinblatt, M. E., Shadick, N. A., & Reich, D. (2006). Principal components analysis corrects for stratification in genome-wide association studies. Nature Genetics, 38, 904–909.

  40. Li, Y., Willer, C. J., Ding, J., Scheet, P., & Abecasis, G. R. (2010). MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes. Genetic Epidemiology, 34, 816–834.

  41. Delaneau, O., Marchini, J., & Zagury, J.-F. (2011). A linear complexity phasing method for thousands of genomes. Nature Methods, 9, 179–181.

  42. Browning, B. L., & Browning, S. R. (2009). A unified approach to genotype imputation and haplotype-phase inference for large data sets of trios and unrelated individuals. American Journal of Human Genetics, 84, 210–223.

  43. Howie, B. N., Donnelly, P., & Marchini, J. (2009). Impute2: a flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genetics, 5, e1000529.

  44. Gauderman, W., & Morrison, J. (2006). QUANTO 1.1: a computer program for power and sample size calculations for genetic-epidemiology studies. http://hydra.usc.edu/gxe

  45. Kang, H. M., Sul, J. H., Service, S. K., Zaitlen, N. A., Kong, S. Y., Freimer, N. B., et al. (2010). Variance component model to account for sample structure in genome-wide association studies. Nature Genetics, 42, 348–354.

  46. Yang, J., Lee, S. H., Goddard, M. E., & Visscher, P. M. (2011). GCTA: a tool for genome-wide complex trait analysis. The American Journal of Human Genetics, 88, 76–82.

  47. Fu, W., O’Connor, T. D., Jun, G., Kang, H. M., Abecasis, G., Leal, S. M., et al. (2013). Analysis of 6,515 exomes reveals the recent origin of most human protein-coding variants. Nature, 493, 216–220.

  48. Hindorff, L. A., Sethupathy, P., Junkins, H. A., Ramos, E. M., Mehta, J. P., Collins, F. S., et al. (2009). Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proceedings of the National Academy of Sciences USA, 106, 9362–9367.

  49. Coetzee, S. G., Rhie, S. K., Berman, B. P., Coetzee, G. A., & Noushmehr, H. (2012). FunciSNP: an R/bioconductor tool integrating functional non-coding data sets with genetic association studies to identify candidate regulatory SNPs. Nucleic Acids Research, 40, e139.

  50. GTEx Consortium. (2013). The Genotype-Tissue Expression (GTEx) project. Nature Genetics, 45, 580–585.

  51. Yang, T. P., Beazley, C., Montgomery, S. B., Dimas, A. S., Gutierrez-Arcelus, M., Stranger, B. E., et al. (2010). Genevar: a database and Java application for the analysis and visualization of SNP-gene associations in eQTL studies. Bioinformatics, 26, 2474–2476.

  52. Nica, A. C., Parts, L., Glass, D., Nisbet, J., Barrett, A., Sekowska, M., et al. (2011). The architecture of gene regulatory variation across multiple human tissues: the MuTHER study. PLoS Genetics, 7, e1002003.

  53. Grundberg, E., Small, K. S., Hedman, A. K., Nica, A. C., Buil, A., Keildson, S., et al. (2012). Mapping cis- and trans-regulatory effects across multiple tissues in twins. Nature Genetics, 44, 1084–1089.

  54. Stranger, B. E., Montgomery, S. B., Dimas, A. S., Parts, L., Stegle, O., Ingle, C. E., et al. (2012). Patterns of cis regulatory variation in diverse human populations. PLoS Genetics, 8, e1002639.

  55. Dimas, A. S., Deutsch, S., Stranger, B. E., Montgomery, S. B., Borel, C., Attar-Cohen, H., et al. (2009). Common regulatory variation impacts gene expression in a cell type-dependent manner. Science, 325, 1246–1250.

  56. Boyle, A. P., Hong, E. L., Hariharan, M., Cheng, Y., Schaub, M. A., Kasowski, M., et al. (2012). Annotation of functional variation in personal genomes using RegulomeDB. Genome Research, 22, 1790–1797.

  57. Ward, L. D., & Kellis, M. (2012). HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants. Nucleic Acids Research, 40, D930–D934.

  58. Cheng, I., Chen, G. K., Nakagawa, H., He, J., Wan, P., Lurie, C., et al. (2012). Evaluating genetic risk for prostate cancer among Japanese and Latinos. Cancer Epidemiology, Biomarkers & Prevention, 21(11), 2048–2058.

  59. Kolata, G. (2012, September 15). Bits of mystery DNA, far from ‘junk,’ play crucial role. New York Times, New York, NY.

  60. Graur, D., Zheng, Y., Price, N., Azevedo, R. B. R., Zufall, R. A., & Elhaik, E. (2013). On the immortality of television sets: “function” in the human genome according to the evolution-free gospel of ENCODE. Genome Biology and Evolution, 5, 578–590.

  61. Kwok, P. Y. (2001). Methods for genotyping single nucleotide polymorphisms. Annual Review of Genomics and Human Genetics, 2, 235–258.

© 2014 Springer Science+Business Media New York

Cite this chapter

Stram, D.O. (2014). Introduction. In: Design, Analysis, and Interpretation of Genome-Wide Association Scans. Statistics for Biology and Health. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-9443-0_1
